This is a draft document that was built and uploaded automatically. It may document beta software and be incomplete or even incorrect. Use this document at your own risk.

Applies to SUSE Enterprise Storage 5.5 (SES 5 & SES 5.5)

1 Salt Cluster Administration

After you deploy a Ceph cluster, you will occasionally need to modify it, for example by adding or removing nodes, disks, or services. This chapter describes how to perform these administration tasks.

1.1 Adding New Cluster Nodes

The procedure of adding new nodes to the cluster is almost identical to the initial cluster node deployment described in Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”:

Tip
Tip: Prevent Rebalancing

When adding an OSD to the existing cluster, bear in mind that the cluster will be rebalancing for some time afterward. To minimize the rebalancing periods, add all OSDs you intend to add at the same time.

Another way is to set the osd crush initial weight = 0 option in the ceph.conf file before adding the OSDs:

  1. Add osd crush initial weight = 0 to /srv/salt/ceph/configuration/files/ceph.conf.d/global.conf.

  2. Create the new configuration:

    root@master # salt MASTER state.apply ceph.configuration.create

    Or:

    root@master # salt-call state.apply ceph.configuration.create
  3. Apply the new configuration:

    root@master # salt TARGET state.apply ceph.configuration
    Note

    If this is not a new node, but you want to proceed as if it were, ensure you remove the /etc/ceph/destroyedOSDs.yml file from the node. Otherwise, any devices from the first attempt will be restored with their previous OSD ID and reweight.

    Run the following commands:

    root@master # salt-run state.orch ceph.stage.1
    root@master # salt-run state.orch ceph.stage.2
    root@master # salt 'node*' state.apply ceph.osd
  4. After the new OSDs are added, adjust their weights as required with the ceph osd reweight command in small increments. Increasing the weights gradually lets the cluster rebalance and return to a healthy state between steps, so that the rebalancing does not overwhelm the cluster or the clients accessing it.
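
The gradual weight adjustment in the last step can be planned in advance. The following Python sketch only builds and prints the command lines; it is not part of DeepSea. The function names, the number of steps, the target weight of 1.0, and the OSD ID 12 are assumptions chosen for illustration.

```python
def reweight_schedule(target, steps):
    """Return `steps` evenly spaced weights, the last one equal to `target`.

    Values are rounded to four decimals to keep the printed commands tidy.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [round(target * i / steps, 4) for i in range(1, steps + 1)]


def reweight_commands(osd_id, target=1.0, steps=4):
    """Build one 'ceph osd reweight' command line per increment."""
    return [
        f"ceph osd reweight {osd_id} {weight}"
        for weight in reweight_schedule(target, steps)
    ]


if __name__ == "__main__":
    # OSD ID 12 is a placeholder; review each command before running it.
    for command in reweight_commands(12):
        print(command)
```

Run the printed commands one at a time, and wait for the cluster to report a healthy state before you apply the next increment.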

  1. Install SUSE Linux Enterprise Server 12 SP3 on the new node and configure its network settings so that it resolves the Salt master host name correctly. Verify that it has a proper connection to both the public and cluster networks, and that time synchronization is correctly configured. Then install the salt-minion package:

    root@minion > zypper in salt-minion

    If the Salt master's host name is different from salt, edit /etc/salt/minion and add the following:

    master: DNS_name_of_your_salt_master

    If you changed the configuration file mentioned above, restart the salt-minion service:

    root@minion > systemctl restart salt-minion.service
  2. On the Salt master, accept the salt key of the new node:

    root@master # salt-key --accept NEW_NODE_KEY
  3. Verify that /srv/pillar/ceph/deepsea_minions.sls targets the new Salt minion and/or set the proper DeepSea grain. Refer to Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.2.2.1 “Matching the Minion Name” and Section 4.3 “Cluster Deployment”, Running Deployment Stages, for more details.

  4. Run the preparation stage. It synchronizes modules and grains so that the new minion can provide all the information DeepSea expects.

    root@master # salt-run state.orch ceph.stage.0
    Important
    Important: Possible Restart of DeepSea Stage 0

    If the Salt master rebooted after its kernel update, you need to restart DeepSea Stage 0.

  5. Run the discovery stage. It will write new file entries in the /srv/pillar/ceph/proposals directory, where you can edit relevant .yml files:

    root@master # salt-run state.orch ceph.stage.1
  6. Optionally, change /srv/pillar/ceph/proposals/policy.cfg if the newly added host does not match the existing naming scheme. For details, refer to Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.5.1 “The policy.cfg File”.

  7. Run the configuration stage. It reads everything under /srv/pillar/ceph and updates the pillar accordingly:

    root@master # salt-run state.orch ceph.stage.2

    Pillar stores data which you can access with the following command:

    root@master # salt target pillar.items
  8. Run the deployment stages, which now include the newly added nodes:

    root@master # salt-run state.orch ceph.stage.3
    root@master # salt-run state.orch ceph.stage.4
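
The stage sequence of this procedure can be written down as data, which helps when the same order is reused in scripts. The following Python sketch only builds the argument lists and does not execute anything, because you normally review the proposals and policy.cfg between Stage 1 and Stage 2. The function name stage_commands is hypothetical.

```python
def stage_commands(stages):
    """Return one salt-run argument list per DeepSea stage number (0 to 5)."""
    commands = []
    for stage in stages:
        if stage not in range(0, 6):
            raise ValueError(f"unknown DeepSea stage: {stage}")
        commands.append(["salt-run", "state.orch", f"ceph.stage.{stage}"])
    return commands


if __name__ == "__main__":
    # Stages used when adding a node: preparation, discovery, configuration,
    # and the two deployment stages.
    for argv in stage_commands([0, 1, 2, 3, 4]):
        print(" ".join(argv))
```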

1.2 Adding New Roles to Nodes

You can deploy all types of supported roles with DeepSea. See Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.5.1.2 “Role Assignment” for more information on supported role types and examples of matching them.

To add a new service to an existing node, follow these steps:

  1. Adapt /srv/pillar/ceph/proposals/policy.cfg to match the existing host with a new role. For more details, refer to Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.5.1 “The policy.cfg File”. For example, if you need to run an Object Gateway on a MON node, the line is similar to:

    role-rgw/xx/x/example.mon-1.sls
  2. Run Stage 2 to update the pillar:

    root@master # salt-run state.orch ceph.stage.2
  3. Run Stage 3 to deploy core services, or Stage 4 to deploy optional services. Running both stages does not hurt.

1.3 Removing and Reinstalling Cluster Nodes

Tip
Tip: Removing a Cluster Node Temporarily

The Salt master expects all minions to be present in the cluster and responsive. If a minion breaks and no longer responds, it causes problems for the Salt infrastructure, mainly for DeepSea and openATTIC.

Before you fix the minion, delete its key from the Salt master temporarily:

root@master # salt-key -d MINION_HOST_NAME

After the minion is fixed, add its key to the Salt master again:

root@master # salt-key -a MINION_HOST_NAME

To remove a role from a cluster, edit /srv/pillar/ceph/proposals/policy.cfg and remove the corresponding line(s). Then run Stages 2 and 5 as described in Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.3 “Cluster Deployment”.

Note
Note: Removing OSDs from Cluster

If you need to remove a particular OSD node from your cluster, ensure that the cluster has more free disk space than the capacity of the disks you intend to remove. Bear in mind that removing an OSD results in rebalancing of the whole cluster.

Before running stage.5 to do the actual removal, always check which OSDs are going to be removed by DeepSea:

root@master # salt-run rescinded.ids

When a role is removed from a minion, the objective is to undo all changes related to that role. For most of the roles, the task is simple, but there may be problems with package dependencies. If a package is uninstalled, its dependencies are not.

Removed OSDs appear as blank drives. The related tasks overwrite the beginning of the file systems and remove backup partitions in addition to wiping the partition tables.

Note
Note: Preserving Partitions Created by Other Methods

Disk drives previously configured by other methods, such as ceph-deploy, may still contain partitions. DeepSea will not automatically destroy these. The administrator must reclaim these drives manually.

Example 1.1: Removing a Salt minion from the Cluster

If your storage minions are named, for example, 'data1.ceph', 'data2.ceph' ... 'data6.ceph', and the related lines in your policy.cfg are similar to the following:

[...]
# Hardware Profile
profile-default/cluster/data*.sls
profile-default/stack/default/ceph/minions/data*.yml
[...]

Then to remove the Salt minion 'data2.ceph', change the lines to the following:

[...]
# Hardware Profile
profile-default/cluster/data[1,3-6]*.sls
profile-default/stack/default/ceph/minions/data[1,3-6]*.yml
[...]

Then run stage.2, check which OSDs are going to be removed, and finish by running stage.5:

root@master # salt-run state.orch ceph.stage.2
root@master # salt-run rescinded.ids
root@master # salt-run state.orch ceph.stage.5
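
To check a pattern such as data[1,3-6]* before editing policy.cfg, you can test it against the expected file names. The following Python sketch assumes shell-style glob semantics and approximates them with the standard fnmatch module; it matches plain file names only and does not read the proposals directory. The function name select_files is hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(pattern, filenames):
    """Return the file names that the glob pattern matches, in input order."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    # File names follow the 'data1.ceph' ... 'data6.ceph' minions of the example.
    files = [f"data{i}.ceph.sls" for i in range(1, 7)]
    for name in select_files("data[1,3-6]*.sls", files):
        print(name)
```

With these semantics, the bracket expression matches a single character, so the pattern would also match a file such as data10.ceph.sls. Keep this in mind if your host names use more than one digit.
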
Example 1.2: Migrating Nodes

Assume the following situation: during the fresh cluster installation, you (the administrator) allocated one of the storage nodes as a stand-alone Object Gateway while waiting for the gateway's hardware to arrive. Now the permanent hardware has arrived for the gateway and you can finally assign the intended role to the backup storage node and have the gateway role removed.

After running Stages 0 and 1 (see Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.3 “Cluster Deployment”, Running Deployment Stages) for the new hardware, you named the new gateway rgw1. If the node data8 needs the Object Gateway role removed and the storage role added, and the current policy.cfg looks like this:

# Hardware Profile
profile-default/cluster/data[1-7]*.sls
profile-default/stack/default/ceph/minions/data[1-7]*.yml

# Roles
role-rgw/cluster/data8*.sls

Then change it to:

# Hardware Profile
profile-default/cluster/data[1-8]*.sls
profile-default/stack/default/ceph/minions/data[1-8]*.yml

# Roles
role-rgw/cluster/rgw1*.sls

Run stages 2 to 4, check which OSDs may be removed, and finish by running stage.5. Stage 3 will add data8 as a storage node. For a moment, data8 will have both roles. Stage 4 will add the Object Gateway role to rgw1 and stage 5 will remove the Object Gateway role from data8:

root@master # salt-run state.orch ceph.stage.2
root@master # salt-run state.orch ceph.stage.3
root@master # salt-run state.orch ceph.stage.4
root@master # salt-run rescinded.ids
root@master # salt-run state.orch ceph.stage.5

1.4 Redeploying Monitor Nodes

When one or more of your monitor nodes fail and are not responding, you need to remove the failed monitors from the cluster and possibly re-add them afterward.

Important
Important: The Minimum is Three Monitor Nodes

The number of monitor nodes must not be less than three. If a monitor node fails and, as a result, your cluster has only two monitor nodes, you need to temporarily assign the monitor role to other cluster nodes before you redeploy the failed monitor nodes. After you redeploy the failed monitor nodes, you can uninstall the temporary monitor roles.

For more information on adding new nodes/roles to the Ceph cluster, see Section 1.1, “Adding New Cluster Nodes” and Section 1.2, “Adding New Roles to Nodes”.

For more information on removing cluster nodes, refer to Section 1.3, “Removing and Reinstalling Cluster Nodes”.

There are two basic degrees of a Ceph node failure:

  • The Salt minion host is broken either physically or on the OS level, and does not respond to the salt 'minion_name' test.ping call. In such a case, you need to redeploy the server completely by following the relevant instructions in Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.3 “Cluster Deployment”.

  • The monitor-related services failed and refuse to recover, but the host responds to the salt 'minion_name' test.ping call. In such a case, follow these steps:

  1. Edit /srv/pillar/ceph/proposals/policy.cfg on the Salt master, and remove or update the lines that correspond to the failed monitor nodes so that they now point to the working monitor nodes. For example:

    [...]
    # MON
    #role-mon/cluster/ses-example-failed1.sls
    #role-mon/cluster/ses-example-failed2.sls
    role-mon/cluster/ses-example-new1.sls
    role-mon/cluster/ses-example-new2.sls
    [...]
  2. Run DeepSea Stages 2 to 5 to apply the changes:

    root@master # deepsea stage run ceph.stage.2
    root@master # deepsea stage run ceph.stage.3
    root@master # deepsea stage run ceph.stage.4
    root@master # deepsea stage run ceph.stage.5

1.5 Adding an OSD Disk to a Node

To add a disk to an existing OSD node, verify that all partitions on the disk were removed and wiped. Refer to Step 12 in Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.3 “Cluster Deployment” for more details. After the disk is empty, add the disk to the YAML file of the node. The path to the file is /srv/pillar/ceph/proposals/profile-default/stack/default/ceph/minions/node_name.yml. After saving the file, run DeepSea stages 2 and 3:

root@master # salt-run state.orch ceph.stage.2
root@master # salt-run state.orch ceph.stage.3
Tip
Tip: Updating Profiles Automatically

Instead of manually editing the YAML file, you can let DeepSea create new profiles. To do so, move the existing profiles out of the way first:

root@master # old /srv/pillar/ceph/proposals/profile-default/
root@master # salt-run state.orch ceph.stage.1
root@master # salt-run state.orch ceph.stage.2
root@master # salt-run state.orch ceph.stage.3

We recommend verifying the suggested proposals before deploying the changes. Refer to Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.5.1.4 “Profile Assignment” for more details on viewing proposals.

1.6 Removing an OSD

You can remove a Ceph OSD from the cluster by running the following commands:

root@master # salt-run disengage.safety
root@master # salt-run remove.osd OSD_ID

OSD_ID must be the number of the OSD without the osd. prefix. For example, for osd.3 use only the digit 3.
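
If OSD names come from other tools in the osd.N form, the prefix needs to be stripped before the IDs are passed to remove.osd. The following Python sketch shows one way to do this; the helper name osd_number is hypothetical and not part of DeepSea.

```python
def osd_number(name):
    """Convert 'osd.3' or '3' to the integer 3; reject anything else."""
    text = str(name).strip()
    if text.startswith("osd."):
        text = text[len("osd."):]
    if not text.isdigit():
        raise ValueError(f"not a valid OSD name or ID: {name!r}")
    return int(text)


if __name__ == "__main__":
    ids = [osd_number(n) for n in ["osd.1", "osd.13", "20"]]
    # Print the command line instead of running it, so it can be reviewed first.
    print("salt-run remove.osd " + " ".join(str(i) for i in ids))
```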

1.6.1 Removing Multiple OSDs

Use the same procedure as mentioned in Section 1.6, “Removing an OSD” but simply supply multiple OSD IDs:

root@master # salt-run disengage.safety
safety is now disabled for cluster ceph

root@master # salt-run remove.osd 1 13 20
Removing osds 1, 13, 20 from minions
Press Ctrl-C to abort
Removing osd 1 from minion data4.ceph
Removing osd 13 from minion data4.ceph
Removing osd 20 from minion data4.ceph
Removing osd 1 from Ceph
Removing osd 13 from Ceph
Removing osd 20 from Ceph
Important
Important: Removed OSD ID Still Present in grains

After the remove.osd command finishes, the ID of the removed OSD is still part of Salt grains and you can see it after running salt target osd.list. The reason is that if the remove.osd command partially fails on removing the data disk, the only reference to related partitions on the shared devices is in the grains. If we updated the grains immediately, then those partitions would be orphaned.

To update the grains manually, run salt target osd.retain. It is part of DeepSea Stage 3, therefore if you are going to run Stage 3 after the OSD removal, the grains get updated automatically.

Tip
Tip: Automatic Retries

You can append the timeout parameter (in seconds) after which Salt retries the OSD removal:

root@master # salt-run remove.osd 20 timeout=6
Removing osd 20 from minion data4.ceph
  Timeout expired - OSD 20 has 22 PGs remaining
Retrying...
Removing osd 20 from Ceph
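
When you capture the output of remove.osd in a log, the timeout messages show how many placement groups were still on the OSD. The following Python sketch extracts these numbers. It assumes that the message format is exactly the one shown in the example output above, which may differ between DeepSea versions; the function name pgs_remaining is hypothetical.

```python
import re

# Pattern taken from the example output above; treat it as an assumption.
_TIMEOUT_LINE = re.compile(r"Timeout expired - OSD (\d+) has (\d+) PGs remaining")


def pgs_remaining(output):
    """Map each OSD ID to the PG count reported in 'Timeout expired' lines."""
    return {int(osd): int(pgs) for osd, pgs in _TIMEOUT_LINE.findall(output)}


if __name__ == "__main__":
    sample = (
        "Removing osd 20 from minion data4.ceph\n"
        "  Timeout expired - OSD 20 has 22 PGs remaining\n"
        "Retrying...\n"
        "Removing osd 20 from Ceph\n"
    )
    print(pgs_remaining(sample))
```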

1.6.2 Removing Broken OSDs Forcefully

There are cases when removing an OSD gracefully (see Section 1.6, “Removing an OSD”) fails. This may happen, for example, if the OSD or its journal, WAL, or DB are broken, when it suffers from hanging I/O operations, or when the OSD disk fails to unmount. In such a case, you need to force the OSD removal. The following command removes both the data partition and the journal or WAL/DB partitions:

root@master # salt target osd.remove OSD_ID force=True
Tip
Tip: Hanging Mounts

If a partition is still mounted on the disk being removed, the command will exit with the 'Unmount failed - check for processes on DEVICE' message. You can then list all processes that access the file system with the fuser -m DEVICE command. If fuser returns nothing, try to unmount the device manually with umount DEVICE and watch the output of the dmesg or journalctl commands.

1.7 Replacing an OSD Disk

There are several reasons why you may need to replace an OSD disk, for example:

  • The OSD disk failed or is soon going to fail based on SMART information, and can no longer be used to store data safely.

  • You need to upgrade the OSD disk, for example to increase its size.

The replacement procedure is the same for both cases. It is also valid for both default and customized CRUSH Maps.

Warning
Warning: The Number of Free Disks

When performing an automated OSD replacement, the number of free disks needs to be the same as the number of disks you need to replace. If more free disks are available in the system, it is impossible to guess which of them to use as replacements. Therefore the automated replacement will not be performed.

  1. Turn off safety limitations temporarily:

    root@master # salt-run disengage.safety
  2. Suppose that '5' is the ID of the OSD whose disk needs to be replaced. The following command marks it as destroyed in the CRUSH Map but leaves its original ID:

    root@master # salt-run replace.osd 5
    Tip
    Tip: replace.osd and remove.osd

    Salt's replace.osd and remove.osd (see Section 1.6, “Removing an OSD”) commands are identical except that replace.osd leaves the OSD as 'destroyed' in the CRUSH Map while remove.osd removes all traces from the CRUSH Map.

  3. Manually replace the failed/upgraded OSD drive.

  4. After replacing the physical drive, you need to modify the configuration of the related Salt minion. You can do so either manually or in an automated way.

    To manually change a Salt minion's configuration, see Section 1.7.1, “Manual Configuration”.

    To change a Salt minion's configuration in an automated way, see Section 1.7.2, “Automated Configuration”.

  5. After you finish either manual or automated configuration of the Salt minion, run DeepSea Stage 2 to update the Salt configuration. It prints out a summary about the differences between the storage configuration and the current setup:

    root@master # salt-run state.orch ceph.stage.2
    deepsea_minions          : valid
    yaml_syntax              : valid
    profiles_populated       : valid
    public network           : 172.16.21.0/24
    cluster network          : 172.16.22.0/24
    
    These devices will be deployed
    data1.ceph: /dev/sdb, /dev/sdc, /dev/sdd, /dev/sde, /dev/sdf, /dev/sdg
    Tip
    Tip: Run salt-run advise.osds

    To summarize the steps that will be taken when the actual replacement is deployed, you can run the following command:

    root@master # salt-run advise.osds
    These devices will be deployed
    
    data1.ceph:
      /dev/disk/by-id/cciss-3600508b1001c7c24c537bdec8f3a698f:
    
    Run 'salt-run state.orch ceph.stage.3'
  6. Run the deployment Stage 3 to deploy the replaced OSD disk:

    root@master # salt-run state.orch ceph.stage.3

1.7.1 Manual Configuration

  1. Find the renamed YAML file for the Salt minion. For example, the file for the minion named 'data1.ceph' is

    /srv/pillar/ceph/proposals/profile-PROFILE_NAME/stack/default/ceph/minions/data1.ceph.yml-replace
  2. Rename the file to its original name (without the -replace suffix), edit it, and replace the old device with the new device name.

    Tip
    Tip: salt osd.report

    Consider using salt 'MINION_NAME' osd.report to identify the device that has been removed.

    For example, if the data1.ceph.yml file contains

    ceph:
      storage:
        osds:
          [...]
          /dev/disk/by-id/cciss-3600508b1001c93595b70bd0fb700ad38:
            format: bluestore
          [...]

    replace the corresponding device path with

    ceph:
      storage:
        osds:
          [...]
          /dev/disk/by-id/cciss-3600508b1001c7c24c537bdec8f3a698f:
            format: bluestore
            replace: True
          [...]
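
The edit described above amounts to swapping one key in the osds mapping and adding the replace flag. The following Python sketch shows the transformation on a plain dictionary. Loading and saving the YAML file is left out, and the function name replace_device is hypothetical.

```python
def replace_device(osds, old_device, new_device):
    """Return a copy of the 'osds' mapping with old_device swapped for new_device.

    The settings of the old entry are kept and 'replace: True' is added,
    mirroring the manual edit described above. The input is not modified.
    """
    if old_device not in osds:
        raise KeyError(f"{old_device} is not listed under osds")
    updated = {}
    for device, settings in osds.items():
        if device == old_device:
            updated[new_device] = {**settings, "replace": True}
        else:
            updated[device] = settings
    return updated


if __name__ == "__main__":
    old = "/dev/disk/by-id/cciss-3600508b1001c93595b70bd0fb700ad38"
    new = "/dev/disk/by-id/cciss-3600508b1001c7c24c537bdec8f3a698f"
    print(replace_device({old: {"format": "bluestore"}}, old, new))
```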

1.7.2 Automated Configuration

While the default profile for Stage 1 may work for the simplest setups, this stage can be optionally customized:

  1. Set the stage_discovery: CUSTOM_STAGE_NAME option in /srv/pillar/ceph/stack/global.yml.

  2. Create the corresponding file /srv/salt/ceph/stage/1/CUSTOM_STAGE_NAME.sls and customize it to reflect your specific requirements for Stage 1. See Appendix A, DeepSea Stage 1 Custom Example for an example.

    Tip
    Tip: Inspect init.sls

    Inspect the /srv/salt/ceph/stage/1/init.sls file to see what variables you can use in your custom Stage 1 .sls file.

  3. Refresh the Pillar:

    root@master # salt '*' saltutil.pillar_refresh
  4. Run Stage 1 to generate the new configuration file:

    root@master # salt-run state.orch ceph.stage.1
Tip
Tip: Custom Options

To list all available options, inspect the output of the salt-run proposal.help command.

If you customized the cluster deployment with a specific command

salt-run proposal.populate OPTION=VALUE

use the same configuration when doing the automated configuration.

1.8 Recovering a Reinstalled OSD Node

If the operating system breaks and is not recoverable on one of your OSD nodes, follow these steps to recover it and redeploy its OSD role with cluster data untouched:

  1. Reinstall the base SUSE Linux Enterprise operating system on the node where the OS broke. Install the salt-minion package on the OSD node, delete the old Salt minion key on the Salt master, and register the new Salt minion's key with the Salt master. For more information on the initial deployment, see Book “Deployment Guide”, Chapter 4 “Deploying with DeepSea/Salt”, Section 4.3 “Cluster Deployment”.

  2. Instead of running the whole of Stage 0, run the following parts:

    root@master # salt 'osd_node' state.apply ceph.sync
    root@master # salt 'osd_node' state.apply ceph.packages.common
    root@master # salt 'osd_node' state.apply ceph.mines
    root@master # salt 'osd_node' state.apply ceph.updates
  3. Run DeepSea Stages 1 to 5:

    root@master # salt-run state.orch ceph.stage.1
    root@master # salt-run state.orch ceph.stage.2
    root@master # salt-run state.orch ceph.stage.3
    root@master # salt-run state.orch ceph.stage.4
    root@master # salt-run state.orch ceph.stage.5
  4. Run DeepSea Stage 0:

    root@master # salt-run state.orch ceph.stage.0
  5. Reboot the relevant OSD node. All OSD disks will be rediscovered and reused.

1.9 Automated Installation via Salt

The installation can be automated by using the Salt reactor. For virtual environments or consistent hardware environments, this configuration will allow the creation of a Ceph cluster with the specified behavior.

Warning

Salt cannot perform dependency checks based on reactor events. There is a real risk of putting your Salt master into a death spiral.

The automated installation requires the following:

  • A properly created /srv/pillar/ceph/proposals/policy.cfg.

  • Prepared custom configuration placed in the /srv/pillar/ceph/stack directory.

The default reactor configuration will only run Stages 0 and 1. This allows testing of the reactor without waiting for subsequent stages to complete.

When the first salt-minion starts, Stage 0 will begin. A lock prevents multiple instances. When all minions complete Stage 0, Stage 1 will begin.

If Stages 0 and 1 complete properly, edit the file

/etc/salt/master.d/reactor.conf

and replace the following line

- /srv/salt/ceph/reactor/discovery.sls

with

- /srv/salt/ceph/reactor/all_stages.sls

Verify that the line is not commented out.
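
The replacement and the final check can also be expressed as two small text operations. The following Python sketch works on the file content as a string and does not write anything to disk. The sample content is a hypothetical illustration and not the actual default reactor.conf; the function names are hypothetical as well.

```python
DISCOVERY = "/srv/salt/ceph/reactor/discovery.sls"
ALL_STAGES = "/srv/salt/ceph/reactor/all_stages.sls"


def use_all_stages(config_text):
    """Return the configuration text with the discovery reactor replaced."""
    return config_text.replace(DISCOVERY, ALL_STAGES)


def all_stages_active(config_text):
    """Return True if an uncommented line references the all_stages reactor."""
    return any(
        ALL_STAGES in line and not line.lstrip().startswith("#")
        for line in config_text.splitlines()
    )


if __name__ == "__main__":
    # Hypothetical sample content, for illustration only.
    sample = (
        "reactor:\n"
        "  - 'salt/minion/*/start':\n"
        "    - /srv/salt/ceph/reactor/discovery.sls\n"
    )
    updated = use_all_stages(sample)
    print(updated)
    print("active:", all_stages_active(updated))
```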

1.10 Updating the Cluster Nodes

Keep the Ceph cluster nodes up-to-date by applying rolling updates regularly.

Important
Important: Access to Software Repositories

Before patching the cluster with the latest software packages, verify that all its nodes have access to SUSE Linux Enterprise Server repositories that match your version of SUSE Enterprise Storage. For SUSE Enterprise Storage 5.5, the following repositories are required:

root # zypper lr -E
#  | Alias   | Name                              | Enabled | GPG Check | Refresh
---+---------+-----------------------------------+---------+-----------+--------
 4 | [...]   | SUSE-Enterprise-Storage-5-Pool    | Yes     | (r ) Yes  | No
 6 | [...]   | SUSE-Enterprise-Storage-5-Updates | Yes     | (r ) Yes  | Yes
 9 | [...]   | SLES12-SP3-Pool                   | Yes     | (r ) Yes  | No
11 | [...]   | SLES12-SP3-Updates                | Yes     | (r ) Yes  | Yes
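
To check many nodes, the repository list can be verified by a script instead of by eye. The following Python sketch parses text in the table layout shown above and reports which of the four required repositories are not enabled. It assumes that the column order matches the example; the function name missing_repos is hypothetical.

```python
REQUIRED_REPOS = (
    "SUSE-Enterprise-Storage-5-Pool",
    "SUSE-Enterprise-Storage-5-Updates",
    "SLES12-SP3-Pool",
    "SLES12-SP3-Updates",
)

# Copy of the example output above, used as sample input.
SAMPLE = """\
#  | Alias   | Name                              | Enabled | GPG Check | Refresh
---+---------+-----------------------------------+---------+-----------+--------
 4 | [...]   | SUSE-Enterprise-Storage-5-Pool    | Yes     | (r ) Yes  | No
 6 | [...]   | SUSE-Enterprise-Storage-5-Updates | Yes     | (r ) Yes  | Yes
 9 | [...]   | SLES12-SP3-Pool                   | Yes     | (r ) Yes  | No
11 | [...]   | SLES12-SP3-Updates                | Yes     | (r ) Yes  | Yes
"""


def missing_repos(zypper_output, required=REQUIRED_REPOS):
    """Return the required repository names that are not listed as enabled."""
    enabled = set()
    for line in zypper_output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Columns: number, alias, name, enabled, GPG check, refresh.
        if len(cells) >= 4 and cells[3] == "Yes":
            enabled.add(cells[2])
    return [name for name in required if name not in enabled]


if __name__ == "__main__":
    print(missing_repos(SAMPLE) or "all required repositories are enabled")
```
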
Tip
Tip: Repository Staging

If you use a staging tool—for example, SUSE Manager, Subscription Management Tool, or Repository Mirroring Tool—that serves software repositories to the cluster nodes, verify that stages for both 'Updates' repositories for SUSE Linux Enterprise Server and SUSE Enterprise Storage are created at the same point in time.

We strongly recommend using a staging tool to apply patches with frozen or staged patch levels. This ensures that new nodes joining the cluster have the same patch level as the nodes already running in the cluster. This way you avoid the need to apply the latest patches to all the cluster's nodes before new nodes can join the cluster.

To update the software packages on all cluster nodes to the latest version, follow these steps:

  1. Update the deepsea, salt-master, and salt-minion packages and restart relevant services on the Salt master:

    root@master # salt -I 'roles:master' state.apply ceph.updates.master
  2. Update and restart the salt-minion package on all cluster nodes:

    root@master # salt -I 'cluster:ceph' state.apply ceph.updates.salt
  3. Update all other software packages on the cluster:

    root@master # salt-run state.orch ceph.stage.0
  4. Restart Ceph related services:

    root@master # salt-run state.orch ceph.restart
Note
Note: Possible Downtime of Ceph Services

When applying updates to Ceph cluster nodes, Ceph services may be restarted. If there is a single point of failure for services such as Object Gateway, NFS Ganesha, or iSCSI, the client machines may be temporarily disconnected from related services.

If DeepSea detects a running Ceph cluster, it applies available updates, restarts running Ceph services, and optionally restarts nodes sequentially if a kernel update was installed. DeepSea follows Ceph's official recommendation of first updating the monitors, then the OSDs, and lastly additional services, such as Metadata Server, Object Gateway, iSCSI Gateway, or NFS Ganesha. DeepSea stops the update process if it detects an issue in the cluster. A trigger for that can be:

  • Ceph reports 'HEALTH_ERR' for longer than 300 seconds.

  • Salt minions are queried for their assigned services to be still up and running after an update. The update fails if the services are down for more than 900 seconds.

Making these arrangements ensures that even with corrupted or failing updates, the Ceph cluster is still operational.
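
The two stop conditions listed above can be summarized as a simple decision rule. The following Python sketch is only an illustration of that rule and not DeepSea's implementation; the function and constant names are hypothetical, and the thresholds are the values stated in the list.

```python
HEALTH_ERR_LIMIT = 300    # seconds Ceph may report HEALTH_ERR
SERVICE_DOWN_LIMIT = 900  # seconds assigned services may stay down


def update_should_stop(health_err_seconds, service_down_seconds):
    """Return True if either limit described above is exceeded."""
    return (
        health_err_seconds > HEALTH_ERR_LIMIT
        or service_down_seconds > SERVICE_DOWN_LIMIT
    )


if __name__ == "__main__":
    print(update_should_stop(health_err_seconds=120, service_down_seconds=60))
    print(update_should_stop(health_err_seconds=301, service_down_seconds=0))
```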

DeepSea Stage 0 updates the system via zypper update and optionally reboots the system if the kernel is updated. If you want to eliminate the possibility of a forced reboot of potentially all nodes, either make sure that the latest kernel is installed and running before initiating DeepSea Stage 0, or disable automatic node reboots as described in Book “Deployment Guide”, Chapter 7 “Customizing the Default Configuration”, Section 7.1.5 “Updates and Reboots during Stage 0”.

Tip
Tip: zypper patch

If you prefer to update the system using the zypper patch command, edit /srv/pillar/ceph/stack/global.yml and add the following line:

update_method_init: zypper-patch

You can change the default update/reboot behavior of DeepSea Stage 0 by adding/changing the stage_prep_master and stage_prep_minion options. For more information, see Book “Deployment Guide”, Chapter 7 “Customizing the Default Configuration”, Section 7.1.5 “Updates and Reboots during Stage 0”.

1.11 Halting or Rebooting Cluster

In some cases it may be necessary to halt or reboot the whole cluster. We recommend carefully checking for dependencies of running services. The following steps provide an outline for stopping and starting the cluster:

  1. Tell the Ceph cluster not to mark OSDs as out:

    cephadm > ceph osd set noout
  2. Stop daemons and nodes in the following order:

    1. Storage clients

    2. Gateways, for example NFS Ganesha or Object Gateway

    3. Metadata Server

    4. Ceph OSD

    5. Ceph Manager

    6. Ceph Monitor

  3. If required, perform maintenance tasks.

  4. Start the nodes and servers in the reverse order of the shutdown process:

    1. Ceph Monitor

    2. Ceph Manager

    3. Ceph OSD

    4. Metadata Server

    5. Gateways, for example NFS Ganesha or Object Gateway

    6. Storage clients

  5. Remove the noout flag:

    cephadm > ceph osd unset noout
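
Because the start order is the exact reverse of the stop order, a checklist for the whole procedure can be derived from a single list. The following Python sketch prints such a checklist; it does not stop or start anything, and the function name maintenance_plan is hypothetical.

```python
SHUTDOWN_ORDER = [
    "Storage clients",
    "Gateways (for example NFS Ganesha or Object Gateway)",
    "Metadata Server",
    "Ceph OSD",
    "Ceph Manager",
    "Ceph Monitor",
]

# Start-up is the reverse of the shutdown order.
STARTUP_ORDER = list(reversed(SHUTDOWN_ORDER))


def maintenance_plan():
    """Return the outline above as an ordered list of checklist entries."""
    plan = ["ceph osd set noout"]
    plan += [f"stop: {item}" for item in SHUTDOWN_ORDER]
    plan.append("perform maintenance tasks (if required)")
    plan += [f"start: {item}" for item in STARTUP_ORDER]
    plan.append("ceph osd unset noout")
    return plan


if __name__ == "__main__":
    for number, entry in enumerate(maintenance_plan(), start=1):
        print(f"{number:2d}. {entry}")
```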

1.12 Adjusting ceph.conf with Custom Settings

If you need to put custom settings into the ceph.conf file, you can do so by modifying the configuration files in the /srv/salt/ceph/configuration/files/ceph.conf.d directory:

  • global.conf

  • mon.conf

  • mgr.conf

  • mds.conf

  • osd.conf

  • client.conf

  • rgw.conf

Note
Note: Unique rgw.conf

The Object Gateway offers a lot of flexibility and is unique compared to the other ceph.conf sections. All other Ceph components have static headers such as [mon] or [osd]. The Object Gateway has unique headers such as [client.rgw.rgw1]. This means that the rgw.conf file needs a header entry. For examples, see

/srv/salt/ceph/configuration/files/rgw.conf

or

/srv/salt/ceph/configuration/files/rgw-ssl.conf
Important
Important: Run Stage 3

After you make custom changes to the above mentioned configuration files, run Stages 3 and 4 to apply these changes to the cluster nodes:

root@master # salt-run state.orch ceph.stage.3
root@master # salt-run state.orch ceph.stage.4

These files are included from the /srv/salt/ceph/configuration/files/ceph.conf.j2 template file, and correspond to the different sections that the Ceph configuration file accepts. Putting a configuration snippet in the correct file enables DeepSea to place it into the correct section. You do not need to add any of the section headers.

Tip

To apply any configuration options only to specific instances of a daemon, add a header such as [osd.1]. The following configuration options will only be applied to the OSD daemon with the ID 1.

1.12.1 Overriding the Defaults

Later statements in a section overwrite earlier ones. Therefore it is possible to override the default configuration as specified in the /srv/salt/ceph/configuration/files/ceph.conf.j2 template. For example, to turn off cephx authentication, add the following three lines to the /srv/salt/ceph/configuration/files/ceph.conf.d/global.conf file:

auth cluster required = none
auth service required = none
auth client required = none

When redefining the default values, Ceph related tools such as rados may issue warnings that specific values from the ceph.conf.j2 were redefined in global.conf. These warnings are caused by one parameter assigned twice in the resulting ceph.conf.

As a workaround for this specific case, follow these steps:

  1. Change the current directory to /srv/salt/ceph/configuration/create:

    root@master # cd /srv/salt/ceph/configuration/create
  2. Copy default.sls to custom.sls:

    root@master # cp default.sls custom.sls
  3. Edit custom.sls and change ceph.conf.j2 to custom-ceph.conf.j2.

  4. Change current directory to /srv/salt/ceph/configuration/files:

    root@master # cd /srv/salt/ceph/configuration/files
  5. Copy ceph.conf.j2 to custom-ceph.conf.j2:

    root@master # cp ceph.conf.j2 custom-ceph.conf.j2
  6. Edit custom-ceph.conf.j2 and delete the following line:

    {% include "ceph/configuration/files/rbd.conf" %}

    Edit /srv/pillar/ceph/stack/global.yml and add the following line:

    configuration_create: custom
  7. Refresh the pillar:

    root@master # salt target saltutil.pillar_refresh
  8. Run Stage 3:

    root@master # salt-run state.orch ceph.stage.3

Now you should have only one entry for each value definition. To re-create the configuration, run:

root@master # salt-run state.orch ceph.configuration.create

and then verify the contents of /srv/salt/ceph/configuration/cache/ceph.conf.
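
The rule from the beginning of this section, that a later statement in a section overwrites an earlier one, can be illustrated with a short Python sketch. It uses Python's configparser module in non-strict mode, which is only a stand-in: Ceph reads ceph.conf with its own parser, so use this to understand the rule and not to validate a real file. The sample text, including the cephx defaults, is an assumption for illustration.

```python
import configparser

# Hypothetical rendered section: defaults first, overrides from global.conf last.
SAMPLE = """\
[global]
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
auth cluster required = none
auth service required = none
auth client required = none
"""


def effective_options(conf_text, section="global"):
    """Return the options of one section, with later duplicates winning."""
    # strict=False accepts repeated options; the last assignment is kept.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    return dict(parser[section])


if __name__ == "__main__":
    for option, value in effective_options(SAMPLE).items():
        print(f"{option} = {value}")
```

The duplicated options in the sample are also what triggers the warnings mentioned above, which is why the workaround aims for a single entry per value.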

1.12.2 Including Configuration Files

If you need to apply a lot of custom configurations, use the following include statements within the custom configuration files to make file management easier. Following is an example of the osd.conf file:

[osd.1]
{% include "ceph/configuration/files/ceph.conf.d/osd1.conf" ignore missing %}
[osd.2]
{% include "ceph/configuration/files/ceph.conf.d/osd2.conf" ignore missing %}
[osd.3]
{% include "ceph/configuration/files/ceph.conf.d/osd3.conf" ignore missing %}
[osd.4]
{% include "ceph/configuration/files/ceph.conf.d/osd4.conf" ignore missing %}

In the previous example, the osd1.conf, osd2.conf, osd3.conf, and osd4.conf files contain the configuration options specific to the related OSD.
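
For a larger number of OSDs, the repetitive include lines can be generated. The following Python sketch produces the same pattern as the osd.conf example above for a list of OSD IDs. The function name osd_include_block is hypothetical, and the sketch only returns text; it does not write osd.conf.

```python
def osd_include_block(osd_ids, directory="ceph/configuration/files/ceph.conf.d"):
    """Return one [osd.N] header plus include statement per OSD ID."""
    lines = []
    for osd_id in osd_ids:
        lines.append(f"[osd.{osd_id}]")
        # Plain concatenation avoids escaping the Jinja braces in an f-string.
        lines.append(
            '{% include "' + directory + "/osd" + str(osd_id)
            + '.conf" ignore missing %}'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(osd_include_block([1, 2, 3, 4]))
```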

Tip
Tip: Runtime Configuration

Changes made to Ceph configuration files take effect after the related Ceph daemons restart. See Section 12.1, “Runtime Configuration” for more information on changing the Ceph runtime configuration.

1.13 Enabling AppArmor Profiles

AppArmor is a security solution that confines programs by a specific profile. For more details, refer to https://documentation.suse.com/sles/12-SP5/single-html/SLES-security/#part-apparmor.

DeepSea provides three states for AppArmor profiles: 'enforce', 'complain', and 'disable'. To activate a particular AppArmor state, run:

salt -I "deepsea_minions:*" state.apply ceph.apparmor.default-STATE

To put the AppArmor profiles in an 'enforce' state:

root@master # salt -I "deepsea_minions:*" state.apply ceph.apparmor.default-enforce

To put the AppArmor profiles in a 'complain' state:

root@master # salt -I "deepsea_minions:*" state.apply ceph.apparmor.default-complain

To disable the AppArmor profiles:

root@master # salt -I "deepsea_minions:*" state.apply ceph.apparmor.default-disable
Tip
Tip: Enabling the AppArmor Service

Each of these three calls verifies if AppArmor is installed and installs it if not, and starts and enables the related systemd service. DeepSea will warn you if AppArmor was installed and started/enabled in another way and therefore runs without DeepSea profiles.
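
Because only three state names are valid, a wrapper script can reject typos before anything is sent to the minions. The following Python sketch builds the argument list for one of the three commands shown above and does not execute it. The function name apparmor_command is hypothetical.

```python
APPARMOR_STATES = ("enforce", "complain", "disable")


def apparmor_command(state):
    """Return the salt argument list that applies the given AppArmor state."""
    if state not in APPARMOR_STATES:
        raise ValueError(
            f"unknown AppArmor state {state!r}; expected one of {APPARMOR_STATES}"
        )
    return [
        "salt", "-I", "deepsea_minions:*",
        "state.apply", f"ceph.apparmor.default-{state}",
    ]


if __name__ == "__main__":
    print(" ".join(apparmor_command("complain")))
```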