7 Configuring and Managing Cluster Resources (Command Line) #
To configure and manage cluster resources, either use the CRM Shell
(crmsh) command line utility or HA Web Konsole (Hawk2), a Web-based
user interface.
This chapter introduces crm, the command line tool. It gives an
overview of the tool, explains how to use templates, and mainly covers
configuring and managing cluster resources: creating basic and advanced
types of resources (groups and clones), configuring constraints,
specifying failover and failback nodes, configuring resource
monitoring, starting, cleaning up or removing resources, and migrating
resources manually.
Sufficient privileges are necessary to manage a cluster. The
crm command and its subcommands need to be run either
as root user or as the CRM owner user (typically the user
hacluster).
However, the user option allows you to run
crm and its subcommands as a regular (unprivileged)
user and to change its ID using sudo whenever
necessary. For example, with the following command crm
will use hacluster as the
privileged user ID:
# crm options user hacluster
Note that you need to set up /etc/sudoers so that
sudo does not ask for a password.
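Such a rule could look like the following sketch; the unprivileged user name tux and the drop-in file location are assumptions:

```
# /etc/sudoers.d/crmsh (hypothetical file): allow user "tux" to run
# commands as the hacluster user without being asked for a password
tux ALL=(hacluster) NOPASSWD: ALL
```

Place the rule with visudo so syntax errors are caught before the file is saved.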
7.1 crmsh—Overview #
The crm command has several subcommands which manage
resources, CIBs, nodes, resource agents, and others. It offers a thorough
help system with embedded examples. All examples follow a naming
convention described in
Appendix B.
By using crm without arguments (or with only one sublevel as argument), the CRM Shell enters the interactive mode. This mode is indicated by the following prompt:
crm(live/HOSTNAME)
For readability reasons, we omit the host name in the interactive crm prompts in our documentation. We only include the host name if you need to run the interactive shell on a specific node, like alice for example:
crm(live/alice)
7.1.1 Getting Help #
Help can be accessed in several ways:
To output the usage of crm and its command line options:
# crm --help
To give a list of all available commands:
# crm help
To access other help sections, not just the command reference:
# crm help topics
To view the extensive help text of the configure subcommand:
# crm configure help
To print the syntax, its usage, and examples of the group subcommand of configure:
# crm configure help group
The same can be achieved with:
# crm help configure group
Almost all output of the help subcommand (do not mix
it up with the --help option) opens a text viewer. This
text viewer allows you to scroll up or down and read the help text more
comfortably. To leave the text viewer, press the Q key.
The crmsh supports full tab completion in Bash directly, not only
in the interactive shell. For example, typing crm help
config and pressing →| will complete the word
as in the interactive shell.
7.1.2 Executing crmsh's Subcommands #
The crm command itself can be used in the following
ways:
Directly: Concatenate all subcommands to crm, press Enter and you see the output immediately. For example, enter crm help ra to get information about the ra subcommand (resource agents).
It is possible to abbreviate subcommands as long as they are unique. For example, you can shorten status as st and crmsh will know what you mean.
Another feature is to shorten parameters. Usually, you add parameters through the params keyword. You can leave out the params section if it is the first and only section. For example, this line:
# crm primitive ipaddr IPaddr2 params ip=192.168.0.55
is equivalent to this line:
# crm primitive ipaddr IPaddr2 ip=192.168.0.55
As crm Shell Script: crm shell scripts contain subcommands of crm. For more information, see Section 7.1.4, “Using crmsh's Shell Scripts”.
As crmsh Cluster Scripts: These are a collection of metadata, references to RPM packages, configuration files, and crmsh subcommands bundled under a single, descriptive name. They are managed through the crm script command.
Do not confuse them with crmsh shell scripts: although both share some common objectives, crm shell scripts only contain subcommands, whereas cluster scripts incorporate much more than a simple enumeration of commands. For more information, see Section 7.1.5, “Using crmsh's Cluster Scripts”.
Interactive as Internal Shell: Type crm to enter the internal shell. The prompt changes to crm(live). With help you can get an overview of the available subcommands. As the internal shell has different levels of subcommands, you can “enter” one by typing this subcommand and pressing Enter.
For example, if you type resource you enter the resource management level. Your prompt changes to crm(live)resource#. If you want to leave the internal shell, use the commands quit, bye, or exit. If you need to go one level back, use back, up, end, or cd.
You can enter the level directly by typing crm and the respective subcommand(s) without any options and pressing Enter.
The internal shell also supports tab completion for subcommands and resources. Type the beginning of a command, press →|, and crm completes the respective object.
In addition to the previously explained methods, crmsh also supports
synchronous command execution. Use the -w option to
activate it. If you have started crm without
-w, you can enable it later with the user preference
wait set to yes (options
wait yes). If this option is enabled, crm
waits until the transition is finished. Whenever a transition is
started, dots are printed to indicate progress. Synchronous command
execution is only applicable for commands like resource
start.
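For example, a sketch (the resource name myIP is an assumption):

```shell
# Start a resource and wait until the transition completes;
# dots are printed while the cluster works on it:
crm -w resource start myIP
```

This requires a running cluster, so it is shown for illustration only.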
The crm tool has management capability (the
subcommands resource and node)
and can be used for configuration (cib,
configure).
The following subsections give you an overview of some important aspects
of the crm tool.
7.1.3 Displaying Information about OCF Resource Agents #
As you need to deal with resource agents in your cluster configuration
all the time, the crm tool contains the
ra command. Use it to show information about resource
agents and to manage them (for additional information, see also
Section 5.3.2, “Supported Resource Agent Classes”):
# crm ra
crm(live)ra#
The classes command lists all classes and providers:
crm(live)ra# classes
lsb ocf / heartbeat linbit lvm2 ocfs2 pacemaker service stonith systemd
To get an overview of all available resource agents for a class (and
provider) use the list command:
crm(live)ra# list ocf
AoEtarget AudibleAlarm CTDB ClusterMon Delay Dummy EvmsSCC Evmsd
Filesystem HealthCPU HealthSMART ICP IPaddr IPaddr2 IPsrcaddr IPv6addr
LVM LinuxSCSI MailTo ManageRAID ManageVE Pure-FTPd Raid1 Route
SAPDatabase SAPInstance SendArp ServeRAID ...
An overview of a resource agent can be viewed with
info:
crm(live)ra# info ocf:linbit:drbd
This resource agent manages a DRBD* resource as a master/slave
resource. DRBD is a shared-nothing replicated storage device.
(ocf:linbit:drbd)

Master/Slave OCF Resource Agent for DRBD

Parameters (* denotes required, [] the default):

drbd_resource* (string): drbd resource name
    The name of the drbd resource from the drbd.conf file.

drbdconf (string, [/etc/drbd.conf]): Path to drbd.conf
    Full path to the drbd.conf file.

Operations' defaults (advisory minimum):

    start            timeout=240
    promote          timeout=90
    demote           timeout=90
    notify           timeout=90
    stop             timeout=100
    monitor_Slave_0  interval=20 timeout=20 start-delay=1m
    monitor_Master_0 interval=10 timeout=20 start-delay=1m
Leave the viewer by pressing Q.
Tip: crm Directly
In the previous example we used the internal shell of the
crm command. However, you do not necessarily need to
use it. You get the same results if you add the respective subcommands
to crm. For example, you can list all the OCF
resource agents by entering crm ra list
ocf in your shell.
7.1.4 Using crmsh's Shell Scripts #
The crmsh shell scripts provide a convenient way to enumerate crmsh
subcommands into a file. This makes it easy to comment specific lines or
to replay them later. Keep in mind that a crmsh shell script can contain
only crmsh subcommands. Any other commands are not
allowed.
Before you can use a crmsh shell script, create a file with specific
commands. For example, the following file prints the status of the cluster
and gives a list of all nodes:
crmsh Shell Script #
# A small example file with some crm subcommands
status
node list
Any line starting with the hash symbol (#) is a
comment and is ignored. If a line is too long, insert a backslash
(\) at the end and continue in the next line. It is
recommended to indent lines that belong to a certain subcommand to improve
readability.
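Putting these rules together, a hypothetical script file might look like this (the resource name and IP address are assumptions):

```shell
# example.cli -- a hypothetical crmsh shell script
# create a virtual IP; the long line is continued with a backslash
configure primitive myIP IPaddr2 \
    params ip=192.168.0.55 \
    op monitor interval=60s
# show the resulting cluster status
status
```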
To use this script, use one of the following methods:
# crm -f example.cli
# crm < example.cli
7.1.5 Using crmsh's Cluster Scripts #
Collecting information from all cluster nodes and deploying any
changes is a key cluster administration task. Instead of performing
the same procedures manually on different nodes (which is error-prone),
you can use the crmsh cluster scripts.
Do not confuse them with the crmsh shell scripts,
which are explained in Section 7.1.4, “Using crmsh's Shell Scripts”.
In contrast to crmsh shell scripts, cluster scripts perform
additional tasks like:
Installing software that is required for a specific task.
Creating or modifying any configuration files.
Collecting information and reporting potential problems with the cluster.
Deploying the changes to all nodes.
crmsh cluster scripts do not replace other tools for managing
clusters—they provide an integrated way to perform the above
tasks across the cluster. Find detailed information at http://crmsh.github.io/scripts/.
7.1.5.1 Usage #
To get a list of all available cluster scripts, run:
# crm script list
To view the components of a script, use the
show command and the name of the cluster script,
for example:
# crm script show mailto
mailto (Basic)
MailTo

This is a resource agent for MailTo. It sends email to a sysadmin
whenever a takeover occurs.

1. Notifies recipients by email in the event of resource takeover

  id (required) (unique)
      Identifier for the cluster resource
  email (required)
      Email address
  subject
      Subject
The output of show contains a title, a
short description, and a procedure. Each procedure is divided
into a series of steps, performed in the given order.
Each step contains a list of required and optional parameters, along with a short description and its default value.
Each cluster script understands a set of common parameters. These parameters can be passed to any script:
| Parameter | Argument | Description |
|---|---|---|
| action | INDEX | If set, only execute a single action (index, as returned by verify) |
| dry_run | BOOL | If set, simulate execution only (default: no) |
| nodes | LIST | List of nodes to execute the script for |
| port | NUMBER | Port to connect to |
| statefile | FILE | When single-stepping, the state is saved in the given file |
| sudo | BOOL | If set, crm will prompt for a sudo password and use sudo where appropriate (default: no) |
| timeout | NUMBER | Execution timeout in seconds (default: 600) |
| user | USER | Run script as the given user |
7.1.5.2 Verifying and Running a Cluster Script #
A cluster script can potentially perform a series of actions and may fail for various reasons. Therefore, before running a cluster script, review the actions that it will perform and verify its parameters.
For example, the mailto resource agent
requires a unique identifier and an e-mail address. To verify these
parameters, run:
# crm script verify mailto id=sysadmin email=tux@example.org
1. Ensure mail package is installed

        mailx

2. Configure cluster resources

        primitive sysadmin MailTo
                email="tux@example.org"
                op start timeout="10"
                op stop timeout="10"
                op monitor interval="10" timeout="10"

        clone c-sysadmin sysadmin
The verify command prints the steps and replaces
any placeholders with your given parameters. If verify
finds any problems, it reports them.
If everything is ok, replace the verify
command with run:
# crm script run mailto id=sysadmin email=tux@example.org
INFO: MailTo
INFO: Nodes: alice, bob
OK: Ensure mail package is installed
OK: Configure cluster resources
Check whether your resource is integrated into your cluster
with crm status:
# crm status
[...]
 Clone Set: c-sysadmin [sysadmin]
     Started: [ alice bob ]
7.1.6 Using Configuration Templates #
The use of configuration templates is deprecated and will
be removed in the future. Configuration templates will be replaced
by cluster scripts, see Section 7.1.5, “Using crmsh's Cluster Scripts”.
Configuration templates are ready-made cluster configurations for
crmsh. Do not confuse them with the resource
templates (as described in
Section 7.4.3, “Creating Resource Templates”). Those are
templates for the cluster and not for the crm
shell.
Configuration templates require minimum effort to be tailored to the particular user's needs. Whenever a template creates a configuration, warning messages give hints which can be edited later for further customization.
The following procedure shows how to create a simple yet functional Apache configuration:
Log in as root and start the crm interactive shell:
# crm configure
Create a new configuration from a configuration template:
Switch to the template subcommand:
crm(live)configure# template
List the available configuration templates:
crm(live)configure template# list templates
gfs2-base filesystem virtual-ip apache clvm ocfs2 gfs2
Decide which configuration template you need. As we need an Apache configuration, we select the apache template and name it g-intranet:
crm(live)configure template# new g-intranet apache
INFO: pulling in template apache
INFO: pulling in template virtual-ip
Define your parameters:
List the configuration you have created:
crm(live)configure template# list
g-intranet
Display the minimum required changes that need to be filled out by you:
crm(live)configure template# show
ERROR: 23: required parameter ip not set
ERROR: 61: required parameter id not set
ERROR: 65: required parameter configfile not set
Invoke your preferred text editor and fill out all lines that have been displayed as errors in Step 3.b:
crm(live)configure template# edit
Show the configuration and check whether it is valid (the values shown below depend on the configuration you have entered in Step 3.c):
crm(live)configure template# show
primitive virtual-ip ocf:heartbeat:IPaddr \
    params ip="192.168.1.101"
primitive apache apache \
    params configfile="/etc/apache2/httpd.conf"
monitor apache 120s:60s
group g-intranet \
    apache virtual-ip
Apply the configuration:
crm(live)configure template# apply
crm(live)configure# cd ..
crm(live)configure# show
Submit your changes to the CIB:
crm(live)configure# commit
It is possible to simplify the commands even more, if you know the details. The above procedure can be summarized with the following command on the shell:
# crm configure template \
   new g-intranet apache params \
   configfile="/etc/apache2/httpd.conf" ip="192.168.1.101"
If you are inside your internal crm shell, use the
following command:
crm(live)configure template# new intranet apache params \
   configfile="/etc/apache2/httpd.conf" ip="192.168.1.101"
However, the previous command only creates its configuration from the configuration template. It neither applies it nor commits it to the CIB.
7.1.7 Testing with Shadow Configuration #
A shadow configuration is used to test different configuration scenarios. If you have created several shadow configurations, you can test them one by one to see the effects of your changes.
The usual process looks like this:
Log in as root and start the crm interactive shell:
# crm configure
Create a new shadow configuration:
crm(live)configure# cib new myNewConfig
INFO: myNewConfig shadow CIB created
If you omit the name of the shadow CIB, a temporary name @tmp@ is created.
If you want to copy the current live configuration into your shadow configuration, use the following command, otherwise skip this step:
crm(myNewConfig)# cib reset myNewConfig
The previous command makes it easier to modify any existing resources later.
Make your changes as usual. After you have created the shadow configuration, all changes go there. To save all your changes, use the following command:
crm(myNewConfig)# commit
If you need the live cluster configuration again, switch back with the following command:
crm(myNewConfig)configure# cib use live
crm(live)#
7.1.8 Debugging Your Configuration Changes #
Before loading your configuration changes back into the cluster, it is
recommended to review your changes with ptest. The
ptest command can show a diagram of actions that will
be induced by committing the changes. You need the
graphviz package to display the diagrams. The
following example is a transcript, adding a monitor operation:
# crm configure
crm(live)configure# show fence-bob
primitive fence-bob stonith:apcsmart \
    params hostlist="bob"
crm(live)configure# monitor fence-bob 120m:60s
crm(live)configure# show changed
primitive fence-bob stonith:apcsmart \
    params hostlist="bob" \
    op monitor interval="120m" timeout="60s"
crm(live)configure# ptest
crm(live)configure# commit
7.1.9 Cluster Diagram #
To output a cluster diagram, use the command
crm configure graph. It displays
the current configuration in a window and therefore
requires X11.
If you prefer Scalable Vector Graphics (SVG), use the following command:
# crm configure graph dot config.svg svg
7.2 Managing Corosync Configuration #
Corosync is the underlying messaging layer for most HA clusters. The
corosync subcommand provides commands for editing and
managing the Corosync configuration.
For example, to list the status of the cluster, use
status:
# crm corosync status
Printing ring status.
Local node ID 175704363
RING ID 0
        id      = 10.121.9.43
        status  = ring 0 active with no faults
Quorum information
------------------
Date:             Thu May  8 16:41:56 2014
Quorum provider:  corosync_votequorum
Nodes:            2
Node ID:          175704363
Ring ID:          4032
Quorate:          Yes

Votequorum information
----------------------
Expected votes:   2
Highest expected: 2
Total votes:      2
Quorum:           2
Flags:            Quorate

Membership information
----------------------
    Nodeid      Votes Name
 175704363          1 alice.example.com (local)
 175704619          1 bob.example.com
The diff command is very helpful: it compares the
Corosync configuration on all nodes (if not stated otherwise) and
prints the difference between them:
# crm corosync diff
--- bob
+++ alice
@@ -46,2 +46,2 @@
-  expected_votes: 2
-  two_node: 1
+  expected_votes: 1
+  two_node: 0
For more details, see http://crmsh.nongnu.org/crm.8.html#cmdhelp_corosync.
7.3 Configuring Global Cluster Options #
Global cluster options control how the cluster behaves when confronted with certain situations. The predefined values can usually be kept. However, to make key functions of your cluster work correctly, you need to adjust the following parameters after basic cluster setup:
Log in as root and start the crm tool:
# crm configure
Use the following commands to set the options for two-node clusters only:
crm(live)configure# property no-quorum-policy=stop
crm(live)configure# property stonith-enabled=true
Important: No Support Without STONITH
A cluster without STONITH is not supported.
Show your changes:
crm(live)configure# show
property $id="cib-bootstrap-options" \
    dc-version="1.1.1-530add2a3721a0ecccb24660a97dbfdaa3e68f51" \
    cluster-infrastructure="corosync" \
    expected-quorum-votes="2" \
    no-quorum-policy="stop" \
    stonith-enabled="true"
Commit your changes and exit:
crm(live)configure# commit
crm(live)configure# exit
7.4 Configuring Cluster Resources #
As a cluster administrator, you need to create cluster resources for every resource or application you run on servers in your cluster. Cluster resources can include Web sites, e-mail servers, databases, file systems, virtual machines, and any other server-based applications or services you want to make available to users at all times.
For an overview of resource types you can create, refer to Section 5.3.3, “Types of Resources”.
7.4.1 Loading Cluster Resources from a File #
Parts or all of the configuration can be loaded from a local file or a network URL. Three different methods can be defined:
replace
This option replaces the current configuration with the new source configuration.
update
This option tries to import the source configuration. It adds new items or updates existing items in the current configuration.
push
This option imports the content from the source into the current configuration (same as update). However, it removes objects that are not available in the new configuration.
To load the new configuration from the file mycluster-config.txt,
use the following syntax:
# crm configure load push mycluster-config.txt
7.4.2 Creating Cluster Resources #
There are three types of RAs (Resource Agents) available with the cluster (for background information, see Section 5.3.2, “Supported Resource Agent Classes”). To add a new resource to the cluster, proceed as follows:
Log in as root and start the crm tool:
# crm configure
Configure a primitive IP address:
crm(live)configure# primitive myIP IPaddr \
    params ip=127.0.0.99 op monitor interval=60s
The previous command configures a “primitive” with the name myIP. You need to choose a class (here ocf), provider (heartbeat), and type (IPaddr). Furthermore, this primitive expects other parameters like the IP address. Change the address to match your setup.
Display and review the changes you have made:
crm(live)configure# show
Commit your changes to take effect:
crm(live)configure# commit
7.4.3 Creating Resource Templates #
If you want to create several resources with similar configurations, a
resource template simplifies the task. See also
Section 5.5.3, “Resource Templates and Constraints” for some
basic background information. Do not confuse them with the
“normal” templates from
Section 7.1.6, “Using Configuration Templates”. Use the
rsc_template command to get familiar with the syntax:
# crm configure rsc_template
usage: rsc_template <name> [<class>:[<provider>:]]<type>
        [params <param>=<value> [<param>=<value>...]]
        [meta <attribute>=<value> [<attribute>=<value>...]]
        [utilization <attribute>=<value> [<attribute>=<value>...]]
        [operations id_spec
            [op op_type [<attribute>=<value>...] ...]]
For example, the following command creates a new resource template with
the name BigVM derived from the
ocf:heartbeat:Xen resource and some default values
and operations:
crm(live)configure# rsc_template BigVM ocf:heartbeat:Xen \
    params allow_mem_management="true" \
    op monitor timeout=60s interval=15s \
    op stop timeout=10m \
    op start timeout=10m
Once you have defined the new resource template, you can use it in primitives
or reference it in order, colocation, or rsc_ticket constraints. To
reference the resource template, use the @ sign:
crm(live)configure# primitive MyVM1 @BigVM \
    params xmfile="/etc/xen/shared-vm/MyVM1" name="MyVM1"
The new primitive MyVM1 is going to inherit everything from the BigVM resource template. For example, the equivalent of the above two definitions would be:
crm(live)configure# primitive MyVM1 Xen \
    params xmfile="/etc/xen/shared-vm/MyVM1" name="MyVM1" \
    params allow_mem_management="true" \
    op monitor timeout=60s interval=15s \
    op stop timeout=10m \
    op start timeout=10m
If you want to overwrite some options or operations, add them to your (primitive) definition. For example, the following new primitive MyVM2 doubles the timeout for monitor operations but leaves others untouched:
crm(live)configure# primitive MyVM2 @BigVM \
    params xmfile="/etc/xen/shared-vm/MyVM2" name="MyVM2" \
    op monitor timeout=120s interval=30s
A resource template may be referenced in constraints to stand for all primitives which are derived from that template. This helps to produce a more concise and clear cluster configuration. Resource template references are allowed in all constraints except location constraints. Colocation constraints may not contain more than one template reference.
7.4.4 Creating a STONITH Resource #
From the crm perspective, a STONITH device is
just another resource. To create a STONITH resource, proceed as
follows:
Log in as root and start the crm interactive shell:
# crm
Get a list of all STONITH types with the following command:
crm(live)# ra list stonith
apcmaster apcmastersnmp apcsmart baytech bladehpi cyclades drac3
external/drac5 external/dracmc-telnet external/hetzner
external/hmchttp external/ibmrsa external/ibmrsa-telnet external/ipmi
external/ippower9258 external/kdumpcheck external/libvirt external/nut
external/rackpdu external/riloe external/sbd external/vcenter
external/vmware external/xen0 external/xen0-ha fence_legacy ibmhmc
ipmilan meatware nw_rpc100s rcd_serial rps10 suicide wti_mpc wti_nps
Choose a STONITH type from the above list and view the list of possible options. Use the following command:
crm(live)# ra info stonith:external/ipmi
IPMI STONITH external device (stonith:external/ipmi)

ipmitool based power management. Apparently, the power off
method of ipmitool is intercepted by ACPI which then makes
a regular shutdown. In case of a split brain on a two-node
cluster it may happen that no node survives. For two-node
clusters use only the reset method.

Parameters (* denotes required, [] the default):

hostname (string): Hostname
    The name of the host to be managed by this STONITH device.
...
Create the STONITH resource with the stonith class, the type you have chosen in Step 3, and the respective parameters if needed, for example:
crm(live)# configure
crm(live)configure# primitive my-stonith stonith:external/ipmi \
    params hostname="alice" \
    ipaddr="192.168.1.221" \
    userid="admin" passwd="secret" \
    op monitor interval=60m timeout=120s
7.4.5 Configuring Resource Constraints #
Having all the resources configured is only one part of the job. Even if the cluster knows all needed resources, it might still not be able to handle them correctly. For example, try not to mount the file system on the slave node of DRBD (in fact, this would fail with DRBD). Define constraints to make this kind of information available to the cluster.
For more information about constraints, see Section 5.5, “Resource Constraints”.
7.4.5.1 Locational Constraints #
The location command defines on which nodes a
resource may be run, may not be run or is preferred to be run.
This type of constraint may be added multiple times for each resource.
All location constraints are evaluated for a given
resource. A simple example that expresses a preference, with a score
of 100, to run the resource fs1 on the node with the name
alice would be the following:
crm(live)configure# location loc-fs1 fs1 100: alice
Another example is a location with ping:
crm(live)configure# primitive ping ping \
    params name=ping dampen=5s multiplier=100 host_list="r1 r2"
crm(live)configure# clone cl-ping ping meta interleave=true
crm(live)configure# location loc-node_pref internal_www \
    rule 50: #uname eq alice \
    rule ping: defined ping
The parameter host_list is a space-separated list
of hosts to ping and count.
Another use case for location constraints is grouping primitives as a
resource set. This can be useful if several
resources depend on, for example, a ping attribute for network
connectivity. In former times, the -inf/ping rules
needed to be duplicated several times in the configuration, making it
unnecessarily complex.
The following example creates a resource set
loc-alice, referencing the virtual IP addresses
vip1 and vip2:
crm(live)configure# primitive vip1 IPaddr2 params ip=192.168.1.5
crm(live)configure# primitive vip2 IPaddr2 params ip=192.168.1.6
crm(live)configure# location loc-alice { vip1 vip2 } inf: alice
In some cases it is much more efficient and convenient to use resource
patterns for your location command. A resource
pattern is a regular expression between two slashes. For example, the
above virtual IP addresses can be all matched with the following:
crm(live)configure# location loc-alice /vip.*/ inf: alice
7.4.5.2 Colocational Constraints #
The colocation command is used to define what
resources should run on the same or on different hosts.
A score of +inf or -inf defines resources that must always or must never run on the same node. It is also possible to use non-infinite scores. In that case the colocation is called advisory and the cluster may decide not to follow it in favor of not stopping other resources if there is a conflict.
For example, to run the resources with the IDs
filesystem_resource and nfs_group
always on the same host, use the following constraint:
crm(live)configure# colocation nfs_on_filesystem inf: nfs_group filesystem_resource
For a master/slave configuration, it is necessary to know if the current node is a master in addition to running the resource locally.
7.4.5.3 Collocating Sets for Resources Without Dependency #
Sometimes it is useful to be able to place a group of resources on the same node (defining a colocation constraint), but without having hard dependencies between the resources.
Use the command weak-bond if you want to place
resources on the same node, but without any action if one of them
fails.
# crm configure assist weak-bond RES1 RES2
The implementation of weak-bond creates a dummy
resource and a colocation constraint with the given resources
automatically.
7.4.5.4 Ordering Constraints #
The order command defines a sequence of actions.
Sometimes it is necessary to provide an order of resource actions or operations. For example, you cannot mount a file system before the device is available to a system. Ordering constraints can be used to start or stop a service right before or after a different resource meets a special condition, such as being started, stopped, or promoted to master.
Use the following command in the crm shell to
configure an ordering constraint:
crm(live)configure# order nfs_after_filesystem mandatory: filesystem_resource nfs_group
7.4.5.5 Constraints for the Example Configuration #
The example used for this section would not work without additional constraints. It is essential that all resources run on the same machine as the master of the DRBD resource. The DRBD resource must be master before any other resource starts. Trying to mount the DRBD device when it is not the master simply fails. The following constraints must be fulfilled:
The file system must always be on the same node as the master of the DRBD resource.
crm(live)configure# colocation filesystem_on_master inf: \
    filesystem_resource drbd_resource:Master
The NFS server and the IP address must be on the same node as the file system.
crm(live)configure# colocation nfs_with_fs inf: \
    nfs_group filesystem_resource
The NFS server and the IP address start after the file system is mounted:
crm(live)configure# order nfs_second mandatory: \
    filesystem_resource:start nfs_group
The file system must be mounted on a node after the DRBD resource is promoted to master on this node.
crm(live)configure# order drbd_first inf: \
    drbd_resource:promote filesystem_resource:start
7.4.6 Specifying Resource Failover Nodes #
To determine a resource failover, use the meta attribute migration-threshold. If the failcount exceeds the migration-threshold on all nodes, the resource remains stopped. For example:
crm(live)configure# location rsc1-alice rsc1 100: alice
Normally, rsc1 prefers to run on alice. If it fails there, migration-threshold is checked and compared to the failcount. If failcount >= migration-threshold then it is migrated to the node with the next best preference.
Start failures set the failcount to inf, depending on the
start-failure-is-fatal option. Stop failures cause
fencing. If there is no STONITH defined, the resource will not migrate.
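As a sketch, migration-threshold is set as a meta attribute of the resource; the resource name, IP address, and values below are illustrative:

```shell
# Move rsc1 away from a node after two failures there, and clear
# the recorded failcount after ten minutes:
crm configure primitive rsc1 IPaddr2 \
    params ip=192.168.1.20 \
    meta migration-threshold=2 failure-timeout=600s
```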
For an overview, refer to Section 5.5.4, “Failover Nodes”.
7.4.7 Specifying Resource Failback Nodes (Resource Stickiness) #
A resource might fail back to its original node when that node is back online and in the cluster. To prevent a resource from failing back to the node that it was running on, or to specify a different node for the resource to fail back to, change its resource stickiness value. You can either specify resource stickiness when you are creating a resource or afterward.
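For example, resource stickiness is also a meta attribute; a sketch with an illustrative value:

```shell
# A stickiness of 100 makes myIP prefer to stay where it currently
# runs; 0 would allow free movement, INFINITY would pin it:
crm configure primitive myIP IPaddr2 \
    params ip=127.0.0.99 \
    meta resource-stickiness=100
```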
For an overview, refer to Section 5.5.5, “Failback Nodes”.
7.4.8 Configuring Placement of Resources Based on Load Impact #
Some resources may have specific capacity requirements, such as a minimum amount of memory. If these requirements are not met, the resources may fail to start or run with degraded performance.
To take this into account, SUSE Linux Enterprise High Availability allows you to specify the following parameters:
The capacity a certain node provides.
The capacity a certain resource requires.
An overall strategy for placement of resources.
For detailed background information about the parameters and a configuration example, refer to Section 5.5.6, “Placing Resources Based on Their Load Impact”.
To configure the resource's requirements and the capacity a node
provides, use utilization attributes.
You can name the utilization attributes according to your preferences
and define as many name/value pairs as your configuration needs. In
certain cases, some agents update the utilization themselves, for
example the VirtualDomain.
In the following example, we assume that you already have a basic configuration of cluster nodes and resources. You now additionally want to configure the capacities a certain node provides and the capacity a certain resource requires.
Log in as root and start the crm interactive shell:

# crm configure

To specify the capacity a node provides, use the following command and replace the placeholder NODE_1 with the name of your node:
crm(live)configure# node NODE_1 utilization hv_memory=16384 cpu=8

With these values, NODE_1 would be assumed to provide 16GB of memory and 8 CPU cores to resources.
To specify the capacity a resource requires, use:
crm(live)configure# primitive xen1 Xen ... \
  utilization hv_memory=4096 cpu=4

This would make the resource consume 4096 of those memory units from NODE_1, and 4 of the CPU units.
Configure the placement strategy with the property command:

crm(live)configure# property ...

The following values are available:
default (default value)
Utilization values are not considered. Resources are allocated according to location scoring. If scores are equal, resources are evenly distributed across nodes.

utilization
Utilization values are considered when deciding if a node has enough free capacity to satisfy a resource's requirements. However, load-balancing is still done based on the number of resources allocated to a node.

minimal
Utilization values are considered when deciding if a node has enough free capacity to satisfy a resource's requirements. An attempt is made to concentrate the resources on as few nodes as possible (to achieve power savings on the remaining nodes).

balanced
Utilization values are considered when deciding if a node has enough free capacity to satisfy a resource's requirements. An attempt is made to distribute the resources evenly, thus optimizing resource performance.
Note: Configuring Resource Priorities
The available placement strategies are best-effort—they do not yet use complex heuristic solvers to always reach optimum allocation results. Ensure that resource priorities are properly set so that your most important resources are scheduled first.
Commit your changes before leaving crmsh:

crm(live)configure# commit
The following example demonstrates a three node cluster of equal nodes, with 4 virtual machines:
crm(live)configure# node alice utilization hv_memory="4000"
crm(live)configure# node bob utilization hv_memory="4000"
crm(live)configure# node charlie utilization hv_memory="4000"
crm(live)configure# primitive xenA Xen \
  utilization hv_memory="3500" meta priority="10" \
  params xmfile="/etc/xen/shared-vm/vm1"
crm(live)configure# primitive xenB Xen \
  utilization hv_memory="2000" meta priority="1" \
  params xmfile="/etc/xen/shared-vm/vm2"
crm(live)configure# primitive xenC Xen \
  utilization hv_memory="2000" meta priority="1" \
  params xmfile="/etc/xen/shared-vm/vm3"
crm(live)configure# primitive xenD Xen \
  utilization hv_memory="1000" meta priority="5" \
  params xmfile="/etc/xen/shared-vm/vm4"
crm(live)configure# property placement-strategy="minimal"
With all three nodes up, xenA will be placed onto a node first, followed by xenD. xenB and xenC would either be allocated together or one of them with xenD.
If one node failed, too little total memory would be available to host them all. xenA would be ensured to be allocated, as would xenD. However, only one of xenB or xenC could still be placed. Since their priority is equal, the result is not defined. To resolve this ambiguity, set a higher priority for one of them.
7.4.9 Configuring Resource Monitoring #
To monitor a resource, there are two possibilities: either define a
monitor operation with the op keyword or use the
monitor command. The following example configures an
Apache resource and monitors it every 60 seconds with the
op keyword:
crm(live)configure# primitive apache apache \
  params ... \
  op monitor interval=60s timeout=30s
The same can be done with:
crm(live)configure# primitive apache apache \
  params ...
crm(live)configure# monitor apache 60s:30s
For an overview, refer to Section 5.4, “Resource Monitoring”.
7.4.10 Configuring a Cluster Resource Group #
One of the most common elements of a cluster is a set of resources that need to be located together, started sequentially, and stopped in the reverse order. To simplify this configuration, we support the concept of groups. The following example creates two primitives (an IP address and an e-mail resource):
Run the crm command as system administrator. The prompt changes to crm(live).

Configure the primitives:

crm(live)# configure
crm(live)configure# primitive Public-IP ocf:heartbeat:IPaddr2 \
  params ip=1.2.3.4 \
  op monitor interval=10s
crm(live)configure# primitive Email systemd:postfix \
  op monitor interval=10s

Group the primitives with their relevant identifiers in the correct order:
crm(live)configure# group g-mailsvc Public-IP Email
To change the order of a group member, use the
modgroup command from the
configure subcommand. Use the following commands to
move the primitive Email before
Public-IP. (This is just to demonstrate the feature):
crm(live)configure# modgroup g-mailsvc add Email before Public-IP
To remove a resource from a group (for example,
Email), use this command:
crm(live)configure# modgroup g-mailsvc remove Email
For an overview, refer to Section 5.3.5.1, “Groups”.
7.4.11 Configuring a Clone Resource #
Clones were initially conceived as a convenient way to start N instances of an IP resource and have them distributed throughout the cluster for load balancing. They have turned out to be useful for several other purposes, including integrating with DLM, the fencing subsystem and OCFS2. You can clone any resource, provided the resource agent supports it.
Learn more about cloned resources in Section 5.3.5.2, “Clones”.
7.4.11.1 Creating Anonymous Clone Resources #
To create an anonymous clone resource, first create a primitive
resource and then refer to it with the clone
command. Do the following:
Log in as root and start the crm interactive shell:

# crm configure

Configure the primitive, for example:

crm(live)configure# primitive Apache apache

Clone the primitive:
crm(live)configure# clone cl-apache Apache
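Optionally, a clone can take meta attributes that control how many instances run; the following values are illustrative:

crm(live)configure# clone cl-apache Apache \
  meta clone-max=2 clone-node-max=1

Here clone-max limits the total number of clone instances in the cluster, and clone-node-max limits the number of instances per node.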
7.4.11.2 Creating Stateful/Multi-State Clone Resources #
Multi-state resources are a specialization of clones. This type allows the instances to be in one of two operating modes, be it active/passive, primary/secondary, or master/slave.
To create a stateful clone resource, first create a primitive resource and then the multi-state resource. The resource agent must support at least the promote and demote operations.
Log in as root and start the crm interactive shell:

# crm configure

Configure the primitive. Change the intervals if needed:

crm(live)configure# primitive my-rsc ocf:myCorp:myAppl \
  op monitor interval=60 \
  op monitor interval=61 role=Master

Create the multi-state resource:
crm(live)configure# ms ms-rsc my-rsc
7.5 Managing Cluster Resources #
Apart from configuring your cluster resources, the
crm tool also allows you to manage existing resources.
The following subsections give you an overview.
7.5.1 Showing Cluster Resources #
When administering a cluster, the command crm configure show
lists the current CIB objects, such as the cluster configuration, global options,
primitives, and others:
# crm configure show
node 178326192: alice
node 178326448: bob
primitive admin_addr IPaddr2 \
	params ip=192.168.2.1 \
	op monitor interval=10 timeout=20
primitive stonith-sbd stonith:external/sbd \
	params pcmk_delay_max=30
property cib-bootstrap-options: \
	have-watchdog=true \
	dc-version=1.1.15-17.1-e174ec8 \
	cluster-infrastructure=corosync \
	cluster-name=hacluster \
	stonith-enabled=true \
	placement-strategy=balanced \
	standby-mode=true
rsc_defaults rsc-options: \
	resource-stickiness=1 \
	migration-threshold=3
op_defaults op-options: \
	timeout=600 \
	record-pending=true
If you have many resources, the output of show
is too verbose. To restrict the output, use the name of the resource.
For example, to list the properties of the primitive
admin_addr only, append the resource name to
show:
# crm configure show admin_addr
primitive admin_addr IPaddr2 \
	params ip=192.168.2.1 \
	op monitor interval=10 timeout=20
However, in some cases, you want to limit the output of specific resources
even more. This can be achieved with filters. Filters
limit the output to specific components. For example, to list the
nodes only, use type:node:
# crm configure show type:node
node 178326192: alice
node 178326448: bob
If you are also interested in primitives, use the
or operator:
# crm configure show type:node or type:primitive
node 178326192: alice
node 178326448: bob
primitive admin_addr IPaddr2 \
	params ip=192.168.2.1 \
	op monitor interval=10 timeout=20
primitive stonith-sbd stonith:external/sbd \
	params pcmk_delay_max=30
Furthermore, to search for an object that starts with a certain string, use this notation:
# crm configure show type:primitive and 'admin*'
primitive admin_addr IPaddr2 \
	params ip=192.168.2.1 \
	op monitor interval=10 timeout=20
To list all available types, enter crm configure show type:
and press the →| key. The Bash completion will give
you a list of all types.
7.5.2 Starting a New Cluster Resource #
To start a new cluster resource you need the respective identifier. Proceed as follows:
Log in as root and start the crm interactive shell:

# crm

Switch to the resource level:

crm(live)# resource

Start the resource with start and press the →| key to show all known resources:

crm(live)resource# start ID
7.5.3 Stopping a Cluster Resource #
To stop one or more existing cluster resources you need the respective identifier(s). Proceed as follows:
Log in as root and start the crm interactive shell:

# crm

Switch to the resource level:

crm(live)# resource

Stop the resource with stop and press the →| key to show all known resources:

crm(live)resource# stop ID

It is possible to stop multiple resources at once:

crm(live)resource# stop ID1 ID2 ...
7.5.4 Cleaning Up Resources #
A resource will be automatically restarted if it fails, but each failure
raises the resource's failcount. If a
migration-threshold has been set for that resource,
the node will no longer be allowed to run the resource when the
number of failures has reached the migration threshold.
Open a shell and log in as user root.

Get a list of all your resources:

# crm resource list
...
 Resource Group: dlm-clvm:1
     dlm:1	(ocf:pacemaker:controld)	Started
     clvm:1	(ocf:heartbeat:clvm)	Started

To clean up the resource dlm, for example:

# crm resource cleanup dlm
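To inspect or reset the failcount manually, crmsh also provides the failcount subcommand (dlm and alice are placeholders for your resource and node name):

# crm resource failcount dlm show alice
# crm resource failcount dlm set alice 0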
7.5.5 Removing a Cluster Resource #
Proceed as follows to remove a cluster resource:
Log in as root and start the crm interactive shell:

# crm configure

Run the following command to get a list of your resources:

crm(live)# resource status

For example, the output can look like this (where myIP is the relevant identifier of your resource):

myIP	(ocf:IPaddr:heartbeat) ...

Delete the resource with the relevant identifier (which implies a commit too):

crm(live)# configure delete YOUR_ID

Commit the changes:

crm(live)# configure commit
7.5.6 Migrating a Cluster Resource #
Although resources are configured to automatically fail over (or migrate) to other nodes of the cluster if a hardware or software failure occurs, you can also manually move a resource to another node using either Hawk2 or the command line.
Use the migrate command for this task. For example,
to migrate the resource ipaddress1 to a cluster node
named bob, use these
commands:
# crm resource
crm(live)resource# migrate ipaddress1 bob
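A migrate call adds a location constraint to the CIB. To remove that constraint again and let the cluster decide where the resource runs, use the unmigrate command:

crm(live)resource# unmigrate ipaddress1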
7.5.7 Grouping/Tagging Resources #
Tags are a way to refer to multiple resources at once, without creating
any colocation or ordering relationship between them. This can be useful
for grouping conceptually related resources. For example, if you have
several resources related to a database, create a tag called
databases and add all resources related to the
database to this tag:
# crm configure tag databases: db1 db2 db3
This allows you to start them all with a single command:
# crm resource start databases
Similarly, you can stop them all too:
# crm resource stop databases
7.5.8 Getting Health Status #
The “health” status of a cluster or node can be displayed with so-called scripts. A script can perform different tasks; they are not limited to health checks. However, for this subsection, we focus on how to get the health status.
To get all the details about the health command, use
describe:
# crm script describe health
It shows a description and a list of all parameters and their default
values. To execute a script, use run:
# crm script run health
If you prefer to run only one step from the suite, the
describe command lists all available steps in the
Steps category.
For example, the following command executes the first step of the
health command. The output is stored in the
health.json file for further investigation:
# crm script run health statefile='health.json'
It is also possible to run the above commands with
crm cluster health.
For additional information regarding scripts, see http://crmsh.github.io/scripts/.
7.6 Setting Passwords Independent of cib.xml #
If your cluster configuration contains sensitive information, such as passwords, it should be stored in local files. That way, these parameters are never logged or leaked in support reports.
Before using secret, run the
show command first to get an overview of all your
resources:
# crm configure show
primitive mydb mysql \
	params replication_user=admin ...
If you want to set a password for the above mydb
resource, use the following commands:
# crm resource secret mydb set passwd linux
INFO: syncing /var/lib/heartbeat/lrm/secrets/mydb/passwd to [your node list]
You can get the saved password back with:
# crm resource secret mydb show passwd
linux
Note that the parameters need to be synchronized between nodes; the
crm resource secret command takes care of that. We
highly recommend using only this command to manage secret parameters.
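The secret command also offers further subcommands, for example stash, which moves an already configured parameter out of the CIB into a local file (passwd as used above):

# crm resource secret mydb stash passwd

The unstash subcommand reverses the operation.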
7.7 Retrieving History Information #
Investigating the cluster history is a complex task. To simplify this
task, crmsh contains the history command with its
subcommands. It is assumed SSH is configured correctly.
A cluster moves between states, migrates resources, and starts important
processes. All these actions can be retrieved with the subcommands of
history.
By default, all history commands look at the events of
the last hour. To change this time frame, use the
limit subcommand. The syntax is:
# crm history
crm(live)history# limit FROM_TIME [TO_TIME]
Some valid examples include:
limit 4:00pm, limit 16:00
Both commands mean the same: today at 4pm.

limit 2012/01/12 6pm
January 12th, 2012 at 6pm.

limit "Sun 5 20:46"
The 5th day of the current month and year (a Sunday) at 8:46pm.
Find more examples and how to create time frames at http://labix.org/python-dateutil.
The info subcommand shows all the parameters which are
covered by the crm report:
crm(live)history# info
Source: live
Period: 2012-01-12 14:10:56 - end
Nodes: alice
Groups:
Resources:
To limit crm report to certain parameters, view the
available options with the subcommand help.
To narrow down the level of detail, use the subcommand
detail with a level:
crm(live)history# detail 1
The higher the number, the more detailed your report will be. Default is
0 (zero).
After you have set the above parameters, use log to show
the log messages.
To display the last transition, use the following command:
crm(live)history# transition -1
INFO: fetching new logs, please wait ...
This command fetches the logs and runs dotty (from the
graphviz package) to show the
transition graph. The shell opens the log file which you can browse with
the ↓ and ↑ cursor keys.
If you do not want to open the transition graph, use the
nograph option:
crm(live)history# transition -1 nograph
7.8 For More Information #
The crm man page.
Visit the upstream project documentation at http://crmsh.github.io/documentation.
See Article “Highly Available NFS Storage with DRBD and Pacemaker” for an exhaustive example.