22 High Availability for virtualization #
This chapter explains how to configure virtual machines as highly available cluster resources.
22.1 Overview #
Virtual machines can take different roles in a High Availability cluster:
A virtual machine can be managed by the cluster as a resource, without the cluster managing the services that run on the virtual machine. In this case, the VM is opaque to the cluster. This is the scenario described in this document.
A virtual machine can be a cluster resource and run pacemaker_remote, which allows the cluster to manage services running on the virtual machine. In this case, the VM is a guest node and is transparent to the cluster. For this scenario, see Article “Pacemaker Remote Quick Start”, Section 4 “Use case 2: setting up a cluster with guest nodes”.

A virtual machine can run a full cluster stack. In this case, the VM is a regular cluster node and is not managed by the cluster as a resource. For this scenario, see Article “Installation and Setup Quick Start”.
The following procedures describe how to set up highly available virtual machines on block storage, with another block device used as an OCFS2 volume to store the VM lock files and XML configuration files. The virtual machines and the OCFS2 volume are configured as resources managed by the cluster, with resource constraints to ensure that the lock file directory is always available before a virtual machine starts on any node. This prevents the virtual machines from starting on multiple nodes.
22.2 Requirements #
A running High Availability cluster with at least two nodes and a fencing device such as SBD.
Passwordless root SSH login between the cluster nodes.

A network bridge on each cluster node, to be used for installing and running the VMs. This must be separate from the network used for cluster communication and management.
Two or more shared storage devices (or partitions on a single shared device), so that all cluster nodes can access the files and storage required by the VMs:
A device to use as an OCFS2 volume, which will store the VM lock files and XML configuration files. Creating and mounting the OCFS2 volume is explained in the following procedure.
A device containing the VM installation source (such as an ISO file or disk image).
Depending on the installation source, you might also need another device for the VM storage disks.
To avoid I/O starvation, these devices must be separate from the shared device used for SBD.
Stable device names for all storage paths, for example, /dev/disk/by-id/DEVICE_ID. A shared storage device might have mismatched /dev/sdX names on different nodes, which causes VM migration to fail.
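To check that the same stable names are visible on every node, you can list the by-id links on all nodes and compare the output, for example:

# crm cluster run "ls -l /dev/disk/by-id/"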
22.3 Configuring cluster resources to manage the lock files #
Use this procedure to configure the cluster to manage the virtual machine lock files. The lock file directory must be available on all nodes so that the cluster is aware of the lock files no matter which node the VMs are running on.
You only need to run the following commands on one of the cluster nodes.
Create an OCFS2 volume on one of the shared storage devices:
# mkfs.ocfs2 /dev/disk/by-id/DEVICE_ID

Run crm configure to start the crm interactive shell.

Create a primitive resource for DLM:

crm(live)configure# primitive dlm ocf:pacemaker:controld \
  op monitor interval=60 timeout=60

Create a primitive resource for the OCFS2 volume:

crm(live)configure# primitive ocfs2 Filesystem \
  params device="/dev/disk/by-id/DEVICE_ID" directory="/mnt/shared" fstype=ocfs2 \
  op monitor interval=20 timeout=40

Create a group for the DLM and OCFS2 resources:

crm(live)configure# group g-virt-lock dlm ocfs2

Clone the group so that it runs on all nodes:

crm(live)configure# clone cl-virt-lock g-virt-lock \
  meta interleave=true

Review your changes with show.

If everything is correct, submit your changes with commit and leave the crm live configuration with quit.

Check the status of the group clone. It should be running on all nodes:

# crm status
[...]
Full List of Resources:
  [...]
  * Clone Set: cl-virt-lock [g-virt-lock]:
    * Started: [ alice bob ]
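As an optional check, you can also verify that the OCFS2 volume is mounted at /mnt/shared on every node, for example:

# crm cluster run "mount | grep /mnt/shared"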
22.4 Preparing the cluster nodes to host virtual machines #
Use this procedure to install and start the required virtualization services, and to configure the nodes to store the VM lock files on the shared OCFS2 volume.
This procedure uses crm cluster run to run commands on all
nodes at once. If you prefer to manage each node individually, you can omit the
crm cluster run portion of the commands.
Install the virtualization packages on all nodes in the cluster:
# crm cluster run "zypper install -y -t pattern kvm_server kvm_tools"

On one node, find and enable the lock_manager setting in the file /etc/libvirt/qemu.conf:

lock_manager = "lockd"

On the same node, find and enable the file_lockspace_dir setting in the file /etc/libvirt/qemu-lockd.conf, and change the value to point to a directory on the OCFS2 volume:

file_lockspace_dir = "/mnt/shared/lockd"

Copy these files to the other nodes in the cluster:

# crm cluster copy /etc/libvirt/qemu.conf
# crm cluster copy /etc/libvirt/qemu-lockd.conf

Enable and start the libvirtd service on all nodes in the cluster:

# crm cluster run "systemctl enable --now libvirtd"

This also starts the virtlockd service.
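As an optional check, you can confirm that both services are active on every node. You can also make sure the lock file directory exists on the OCFS2 volume; if virtlockd has not created it yet, creating it manually on one node is harmless (an assumption to adapt to your setup):

# crm cluster run "systemctl is-active libvirtd virtlockd"
# mkdir -p /mnt/shared/lockd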
22.5 Adding virtual machines as cluster resources #
Use this procedure to add virtual machines to the cluster as cluster resources, with
resource constraints to ensure the VMs can always access the lock files. The lock files are
managed by the resources in the group g-virt-lock, which is available on
all nodes via the clone cl-virt-lock.
Install your virtual machines on one of the cluster nodes, with the following restrictions:
The installation source and storage must be on shared devices.
Do not configure the VMs to start on host boot.
For more information, see Virtualization Guide for SUSE Linux Enterprise Server.
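For example, a VM installation that follows these restrictions might look similar to the following virt-install call. This is only a sketch: the bridge name br0, the placeholder PATH_TO_ISO, and the sizing values are assumptions that you need to replace with values from your own environment.

# virt-install --name VM1 --memory 2048 --vcpus 2 \
  --disk path=/dev/disk/by-id/DEVICE_ID \
  --cdrom PATH_TO_ISO \
  --network bridge=br0 \
  --graphics vnc

Do not add the --autostart option, so that the VM does not start on host boot.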
If the virtual machines are running, shut them down. The cluster will start the VMs after you add them as resources.
Dump the XML configuration to the OCFS2 volume. Repeat this step for each VM:
# virsh dumpxml VM1 > /mnt/shared/VM1.xml

Make sure the XML files do not contain any references to unshared local paths.

Run crm configure to start the crm interactive shell.

Create primitive resources to manage the virtual machines. Repeat this step for each VM:

crm(live)configure# primitive VM1 VirtualDomain \
  params config="/mnt/shared/VM1.xml" remoteuri="qemu+ssh://%n/system" \
  meta allow-migrate=true \
  op monitor timeout=30s interval=10s

The option allow-migrate=true enables live migration. If the value is set to false, the cluster migrates the VM by shutting it down on one node and restarting it on another node.

If you need to set utilization attributes to help place VMs based on their load impact, see Section 11.10, “Placing resources based on their load impact”.

Create a colocation constraint so that the virtual machines can only start on nodes where cl-virt-lock is running:

crm(live)configure# colocation col-fs-virt inf: ( VM1 VM2 VMX ) cl-virt-lock

Create an ordering constraint so that cl-virt-lock always starts before the virtual machines:

crm(live)configure# order o-fs-virt Mandatory: cl-virt-lock ( VM1 VM2 VMX )

Review your changes with show.

If everything is correct, submit your changes with commit and leave the crm live configuration with quit.

Check the status of the virtual machines:

# crm status
[...]
Full List of Resources:
  [...]
  * Clone Set: cl-virt-lock [g-virt-lock]:
    * Started: [ alice bob ]
  * VM1 (ocf::heartbeat:VirtualDomain): Started alice
  * VM2 (ocf::heartbeat:VirtualDomain): Started alice
  * VMX (ocf::heartbeat:VirtualDomain): Started alice
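If you decide to set utilization attributes for the VM resources, as mentioned in the resource creation step above, the primitive definition can include a utilization section. The following is only a sketch: the attribute names cpu and hv_memory and their values are examples, and they must match the utilization attributes configured for your nodes (see Section 11.10):

crm(live)configure# primitive VM1 VirtualDomain \
  params config="/mnt/shared/VM1.xml" remoteuri="qemu+ssh://%n/system" \
  meta allow-migrate=true \
  utilization cpu=2 hv_memory=2048 \
  op monitor timeout=30s interval=10s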
The virtual machines are now managed by the High Availability cluster, and can migrate between the cluster nodes.
After adding virtual machines as cluster resources, do not manage them manually. Only use the cluster tools as described in Chapter 12, Managing cluster resources.
To perform maintenance tasks on cluster-managed VMs, see Section 32.2, “Different options for maintenance tasks”.
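For example, to move a VM to another node, use the cluster tools instead of virsh. Note that crm resource move adds a location constraint; after the VM has migrated, you can remove that constraint again so the cluster remains free to place the VM later. Depending on your crmsh version, the command for this is clear or unmove:

# crm resource move VM1 NODE
# crm resource clear VM1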
22.6 Testing the setup #
Use the following tests to confirm that the virtual machine High Availability setup works as expected.
Perform these tests in a test environment, not a production environment.
The virtual machine VM1 is running on node alice.

On node bob, try to start the VM manually with virsh start VM1.

Expected result: The virsh command fails. VM1 cannot be started manually on bob when it is running on alice.
The virtual machine VM1 is running on node alice.

Open two terminals.

In the first terminal, connect to VM1 via SSH.

In the second terminal, try to migrate VM1 to node bob with crm resource move VM1 bob.

Run crm_mon -r to monitor the cluster status until it stabilizes. This might take a short time.

In the first terminal, check whether the SSH connection to VM1 is still active.

Expected result: The cluster status shows that VM1 has started on bob. The SSH connection to VM1 remains active during the whole migration.
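The crm resource move command used in this test adds a location constraint that prefers node bob for VM1. After you finish testing, you might want to remove that constraint so the cluster can place VM1 freely again (depending on your crmsh version, use clear or unmove):

# crm resource clear VM1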
The virtual machine VM1 is running on node bob.

Reboot bob.

On node alice, run crm_mon -r to monitor the cluster status until it stabilizes. This might take a short time.

Expected result: The cluster status shows that VM1 has started on alice.
The virtual machine VM1 is running on node alice.

Simulate a crash on alice by forcing the machine off or unplugging the power cable.

On node bob, run crm_mon -r to monitor the cluster status until it stabilizes. VM failover after a node crashes usually takes longer than VM migration after a node reboots.

Expected result: After a short time, the cluster status shows that VM1 has started on bob.