This is a draft document that was built and uploaded automatically. It may document beta software and be incomplete or even incorrect. Use this document at your own risk.
Regular operation and maintenance includes:
Configuring data retention for the InfluxDB database. This can be configured in the monasca barclamp. For details, see Deployment Guide using Crowbar.
Configuring data retention for the Elasticsearch database. This can be configured in the monasca barclamp. For details, see Deployment Guide using Crowbar.
Removing metrics data from the InfluxDB database.
Removing log data from the Elasticsearch database.
Handling log files of agents and services.
Backup and recovery of databases, configuration files, and dashboards.
Metrics data is stored in the Metrics and Alarms InfluxDB Database. InfluxDB features an SQL-like query language for querying data and performing aggregations on that data.
The Metrics Agent configuration defines the metrics and types of measurement for which data is stored. For each measurement, a so-called series is written to the InfluxDB database. A series consists of a timestamp, the metric, and the measured value.
Every series can be assigned key tags. In the case of SUSE OpenStack Cloud Crowbar Monitoring, this is the _tenant_id tag. This tag identifies the OpenStack project for which the metrics data has been collected.
From time to time, you may want to delete outdated or unnecessary metrics data from the Metrics and Alarms Database, for example, to save space or remove data for metrics you are no longer interested in. To delete data, you use the InfluxDB command line interface, the interactive shell that is provided for the InfluxDB database.
Proceed as follows to delete metrics data from the database:
Create a backup of the database.
Determine the ID of the OpenStack project for the data to be deleted:
Log in to the OpenStack dashboard and go to Identity > Projects. The monasca project initially provides all metrics data related to SUSE OpenStack Cloud Crowbar Monitoring.
In the course of the productive operation of SUSE OpenStack Cloud Crowbar Monitoring, additional projects may be created, for example, for application operators.
The Project ID field shows the relevant tenant ID.
Log in to the host where the Monitoring Service is installed.
Go to the directory where InfluxDB is installed:
cd /usr/bin
Connect to InfluxDB using the InfluxDB command line interface as follows:
./influx -host <host_ip>
Replace <host_ip> with the IP address of the
machine on which SUSE OpenStack Cloud Crowbar Monitoring is
installed.
The output of this command is, for example, as follows:
Connected to http://localhost:8086 version 1.1.1
InfluxDB shell version: 1.1.1
Connect to the InfluxDB database of SUSE OpenStack Cloud Crowbar Monitoring
(mon):
> show databases
name: databases
name
----
mon
_internal
> use mon
Using database mon
Check the outdated or unnecessary data to be deleted.
You can view all measurements for a specific project as follows:
SHOW MEASUREMENTS WHERE _tenant_id = '<project ID>'
You can view the series for a specific metric and project, for example, as follows:
SHOW SERIES FROM "cpu.user_perc" WHERE _tenant_id = '<project ID>'
Delete the desired data.
When a project (tenant) is no longer used, delete all series for the project as follows:
DROP SERIES WHERE _tenant_id = '<project ID>'
Example:
DROP SERIES WHERE _tenant_id = '27620d7ee6e948e29172f1d0950bd6f4'
When a metric is no longer relevant for a project, delete all series for the specific project and metric as follows:
DROP SERIES FROM "<metrics>" WHERE _tenant_id = '<project ID>'
Example:
DROP SERIES FROM "cpu.user_perc" WHERE _tenant_id = '27620d7e'
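The same DROP statements can also be issued non-interactively with the -execute option of the InfluxDB command line interface, which is convenient for scripting cleanups. The following is a dry-run sketch that only prints the command; the host address and project ID are assumptions taken from the examples above.

```shell
# Dry-run sketch: build the non-interactive form of the DROP and echo it for
# review instead of executing it. Host and project ID are example assumptions.
HOST_IP=192.168.56.81
PROJECT_ID=27620d7ee6e948e29172f1d0950bd6f4
DROP_CMD="influx -host ${HOST_IP} -database mon -execute \"DROP SERIES WHERE _tenant_id = '${PROJECT_ID}'\""
echo "$DROP_CMD"
```

Remove the echo and run the resulting command only after you have verified the affected series with SHOW SERIES.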
Restart the influxdb service, for example, as follows:
sudo systemctl restart influxdb
Log data is stored in the Elasticsearch database. Elasticsearch stores the data in indices. One index per day is created for every OpenStack project.
By default, the indices are stored in the following directory on the host where the Monitoring Service is installed:
/var/data/elasticsearch/<cluster-name>/nodes/<node-name>
Example:
/var/data/elasticsearch/elasticsearch/nodes/0
If your system uses a different directory, look up the
path.data parameter in the Elasticsearch configuration
file, /etc/elasticsearch/elasticsearch.yml.
If you want to delete outdated or unnecessary log data from the Elasticsearch database, proceed as follows:
Make sure that curl is installed. If this is not the
case, install the package with
sudo zypper in curl
Create a backup of the Elasticsearch database.
Determine the ID of the OpenStack project for the data to be deleted:
Log in to the OpenStack dashboard and go to Identity > Projects. The monasca project initially provides all log data related to SUSE OpenStack Cloud Crowbar Monitoring.
In the course of the productive operation of SUSE OpenStack Cloud Crowbar Monitoring, additional projects may be created.
The Project ID field shows the relevant ID.
Log in to the host where the Monitoring Service is installed.
Make sure that the data you want to delete exists by executing the following command:
curl -XHEAD -i 'http://localhost:<port>/<projectID-date>'
For example, if Elasticsearch is listening at port 9200 (default), the ID
of the OpenStack project is abc123, and you want to
check the index of 2015, July 1st, the command is as follows:
curl -XHEAD -i 'http://localhost:9200/abc123-2015-07-01'
If the HTTP response is 200, the index exists; if the
response is 404, it does not exist.
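The existence check can be scripted so that a deletion only proceeds when the index is actually there. A minimal sketch, where the helper name is hypothetical and the status code would in practice come from the curl call shown above:

```shell
# Hypothetical helper that interprets the HTTP status code of the HEAD request.
index_exists() {
  case "$1" in
    200) echo "index exists" ;;
    404) echo "index does not exist" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# In practice the status code would be captured like this (not executed here):
#   status=$(curl -s -o /dev/null -I -w '%{http_code}' 'http://localhost:9200/abc123-2015-07-01')
index_exists 200   # -> index exists
index_exists 404   # -> index does not exist
```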
Delete the index as follows:
curl -XDELETE -i 'http://localhost:<port>/<projectID-date>'
Example:
curl -XDELETE -i 'http://localhost:9200/abc123-2015-07-01'
This command either returns an error, such as
IndexMissingException, or acknowledges the successful
deletion of the index.
Be aware that the -XDELETE command immediately deletes
the index file!
For both -XHEAD and -XDELETE, you can use wildcards to process several indices at once. For example, you can delete all indices of a specific project for the whole month of July 2015:
curl -XDELETE -i 'http://localhost:9200/abc123-2015-07-*'
Take extreme care when using wildcards for the deletion of indices. You could delete all existing indices with one single command!
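A safer alternative to a wildcard deletion is to generate the per-day commands and review them before running anything. The following dry-run sketch only prints the commands; the port and project ID are the assumed example values from above.

```shell
# Dry-run sketch: print one DELETE command per day of July 2015 so the list of
# target indices can be reviewed before deletion. Port and project ID are
# example assumptions; nothing is deleted by this loop.
PORT=9200
PROJECT=abc123
for day in $(seq -w 1 31); do
  echo "curl -XDELETE -i 'http://localhost:${PORT}/${PROJECT}-2015-07-${day}'"
done
```

Only after checking the printed list would you execute the commands (or the single wildcard form) against the live Elasticsearch instance.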
In case of trouble with the SUSE OpenStack Cloud Crowbar Monitoring
services, you can study their log files to find the reason. The log files are
also useful if you need to contact your support organization. For storing the
log files, the default installation uses the /var/log
directory on the hosts where the agents or services are installed.
You can use systemd, a system and session manager for Linux, and journald, a Linux logging interface, to address dispersed log files.
The SUSE OpenStack Cloud Crowbar Monitoring installer automatically
puts all SUSE OpenStack Cloud Crowbar Monitoring services under the
control of systemd. journald provides a
centralized management solution for the logging of all processes that are
controlled by systemd. The logs are collected and managed
in a so-called journal controlled by the journald daemon.
For details on the systemd and journald utilities, refer to the SUSE Linux Enterprise Server documentation at https://documentation.suse.com/sles/15-SP1/single-html/SLES-admin/#part-system.
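As an illustration, typical journalctl queries for services under systemd control might look as follows. The unit names are assumptions; they depend on how the services were deployed on your node. The commands are echoed as a dry run so they can be reviewed before use.

```shell
# Dry-run sketch: example journald queries for monitored services.
# Unit names (influxdb, elasticsearch) are assumptions for illustration.
JOURNAL_CMDS='journalctl -u influxdb --since today
journalctl -u elasticsearch -f'
printf '%s\n' "$JOURNAL_CMDS"
```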
Typical tasks of the Monitoring Service operator are to make regular backups, particularly of the data created during operation.
At regular intervals, you should make a backup of all:
Databases.
Configuration files of the individual agents and services.
Monitoring and log dashboards you have created and saved.
SUSE OpenStack Cloud Crowbar Monitoring does not offer integrated backup and recovery mechanisms. Instead, use the mechanisms and procedures of the individual components.
You need to create regular backups of the following databases on the host where the Monitoring Service is installed:
Elasticsearch database for historic log data.
InfluxDB database for historic metrics data.
MariaDB database for historic configuration information.
It is recommended that backup and restore operations for databases are carried out by experienced operators only.
Before backing up and restoring a database, we recommend stopping the Monitoring API and the Log API on the monasca-server node and checking that all data has been processed. This ensures that no data is written to a database during a backup and restore operation. After backing up and restoring a database, restart the APIs.
To stop the Monitoring API and the Log API, use the following command:
systemctl stop apache2
To check that all Kafka queues are empty, list the existing consumer groups and check the LAG column for each group. It should be 0. For example:
kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --list
kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --describe \
  --group 1_metrics | column -t -s ','
kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --describe \
  --group transformer-logstash-consumer | column -t -s ','
kafka-consumer-groups.sh --zookeeper 192.168.56.81:2181 --describe \
  --group thresh-metric | column -t -s ','
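Checking the LAG column by eye is error-prone across several groups. The following sketch shows how the describe output could be filtered so that only partitions with a non-zero lag are reported; the sample data stands in for live output, and the column layout is an assumption based on the comma-separated describe format.

```shell
# Sketch: report partitions whose LAG column is non-zero. The sample below
# stands in for live `kafka-consumer-groups.sh --describe` output; the
# comma-separated column order (LAG in column 6) is an assumption.
describe_output='GROUP, TOPIC, PARTITION, CURRENT-OFFSET, LOG-END-OFFSET, LAG, OWNER
1_metrics, metrics, 0, 1500, 1500, 0, consumer-1
1_metrics, metrics, 1, 1400, 1403, 3, consumer-2'

echo "$describe_output" \
  | awk -F', *' 'NR > 1 && $6 != 0 { print $2, "partition", $3, "lag", $6 }'
```

An empty result would indicate that all partitions of the group have been fully consumed and it is safe to proceed with the backup.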
To restart the Monitoring API and the Log API, use the following command:
systemctl start apache2
For backing up and restoring your Elasticsearch database, use the Snapshot and Restore module of Elasticsearch.
To create a backup of the database, proceed as follows:
Make sure that curl is installed. If it is not, install the package with sudo zypper in curl.
Log in to the host where the Monitoring Service is installed.
Create a snapshot repository. You need the Elasticsearch bind address for all commands. Run grep network.bind_host /etc/elasticsearch/elasticsearch.yml to find the bind address, and replace IP in the following commands with this address. For example:
curl -XPUT http://IP:9200/_snapshot/my_backup -d '{
"type": "fs",
"settings": {
"location": "/mount/backup/elasticsearch1/my_backup",
"compress": true
}
}'
The example registers a shared file system repository ("type":
"fs") that uses the
/mount/backup/elasticsearch1 directory for storing
snapshots.
The directory for storing snapshots must be configured in the elasticsearch/repo_dir
setting in the monasca barclamp (see the section called “Deploying monasca (Optional)”).
The directory must be manually mounted before creating the snapshot. The
elasticsearch user must be specified as the owner of the directory.
compress is turned on to compress the metadata files.
Check whether the repository was created successfully:
curl -XGET http://IP:9200/_snapshot/my_backup
This example response shows a successfully created repository:
{
"my_backup": {
"type": "fs",
"settings": {
"compress": "true",
"location": "/mount/backup/elasticsearch1/my_backup"
}
}
}
Create a snapshot of your database that contains all indices. A repository can contain multiple snapshots of the same database. The name of a snapshot must be unique within the snapshots created for your database, for example:
curl -XPUT http://IP:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true
The example creates a snapshot named snapshot_1 for all
indices in the my_backup repository.
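To confirm that the snapshot was created, you can list the snapshots in the repository with the Elasticsearch snapshot API. Shown here as a dry run; replace IP with the Elasticsearch bind address before running it.

```shell
# Dry-run sketch: list all snapshots in the my_backup repository to verify
# that snapshot_1 exists. Replace IP with the actual bind address to execute.
SNAPSHOT_LIST_CMD='curl -XGET http://IP:9200/_snapshot/my_backup/_all'
echo "$SNAPSHOT_LIST_CMD"
```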
To restore the database instance, proceed as follows:
Close all indices of your database, for example:
curl -XPOST http://IP:9200/_all/_close
Restore all indices from the snapshot you have created, for example:
curl -XPOST http://IP:9200/_snapshot/my_backup/snapshot_1/_restore
The example restores all indices from snapshot_1 that is
stored in the my_backup repository.
For additional information on backing up and restoring an Elasticsearch database, refer to the Elasticsearch documentation.
For backing up and restoring your InfluxDB database, you can use the InfluxDB
shell. The shell is part of your InfluxDB distribution. If you installed
InfluxDB via a package manager, the shell is, by default, installed in the
/usr/bin directory.
To create a backup of the database, proceed as follows:
Log in to the InfluxDB database as a user who is allowed to run the
influxdb service, for example:
su influxdb -s /bin/bash
Back up the database, for example:
influxd backup -database mon /mount/backup/mysnapshot
monasca uses mon as the name of the database.
The example creates the backup for the database in
/mount/backup/mysnapshot.
Before restoring the database, make sure that all database processes are shut down. To restore the database, you can then proceed as follows:
If required, delete all files not included in the backup by dropping the database before you carry out the restore operation. A restore operation restores all files included in the backup. Files created or merged at a later point in time are not affected. For example:
influx -host IP -execute 'drop database mon;'
Replace IP with the IP address that the database
is listening to. You can run influxd config and look up
the IP address in the [http] section.
Stop the InfluxDB database service:
systemctl stop influxdb
Log in to the InfluxDB database as a user who is allowed to run the
influxdb service:
su influxdb -s /bin/bash
Restore the metastore:
influxd restore -metadir /var/opt/influxdb/meta /mount/backup/mysnapshot
Restore the database, for example:
influxd restore -database mon -datadir /var/opt/influxdb/data /mount/backup/mysnapshot
The example restores the backup from /mount/backup/mysnapshot to the data directory, /var/opt/influxdb/data.
Ensure that the file permissions for the restored database are set correctly:
chown -R influxdb:influxdb /var/opt/influxdb
Start the InfluxDB database service:
systemctl start influxdb
For additional information on backing up and restoring an InfluxDB database, refer to the InfluxDB documentation.
For backing up and restoring your MariaDB database, you can use the mysqldump utility program. mysqldump performs a logical backup that produces a set of SQL statements. These statements can later be executed to restore the database.
To back up your MariaDB database, you must be the owner of the database or a user with superuser privileges, for example:
mysqldump -u root -p mon > dumpfile.sql
In addition to the name of the database, you have to specify the name and the location where mysqldump stores its output.
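For regular backups, a date-stamped dump file name prevents successive runs from overwriting each other. A minimal sketch; the backup path is an assumption, and the mysqldump call is echoed as a dry run.

```shell
# Sketch: build a date-stamped dump file name for repeated backups.
# The backup directory is an assumed example; the mysqldump command is
# echoed for review rather than executed.
DUMPFILE="/mount/backup/mariadb/mon-$(date +%Y-%m-%d).sql"
echo "mysqldump -u root -p mon > ${DUMPFILE}"
```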
To restore your MariaDB database, proceed as follows:
Log in to the host where the Monitoring Service is installed as a user with root privileges.
Make sure that the mariadb service is running:
systemctl start mariadb
Log in to the database you have backed up as a user with root privileges, for example:
mysql -u root -p mon
Remove and then re-create the database:
DROP DATABASE mon; CREATE DATABASE mon;
Exit mariadb:
\q
Restore the database, for example:
mysql -u root -p mon < dumpfile.sql
For additional information on backing up and restoring a MariaDB database with mysqldump, refer to the MariaDB documentation.
Below you find a list of the configuration files of the agents and the individual services included in the Monitoring Service. Back up these files at least after you have installed and configured SUSE OpenStack Cloud Crowbar Monitoring and after each change in the configuration.
/etc/influxdb/influxdb.conf
/etc/kafka/server.properties
/etc/my.cnf
/etc/my.cnf.d/client.cnf
/etc/my.cnf.d/mysql-clients.cnf
/etc/my.cnf.d/server.cnf
/etc/monasca/agent/agent.yaml
/etc/monasca/agent/conf.d/*
/etc/monasca/agent/supervisor.conf
/etc/monasca/api-config.conf
/etc/monasca/log-api-config.conf
/etc/monasca/log-api-config.ini
/etc/monasca-log-persister/monasca-log-persister.conf
/etc/monasca-log-transformer/monasca-log-transformer.conf
/etc/monasca-log-agent/agent.conf
/etc/monasca-notification/monasca-notification.yaml
/etc/monasca-persister/monasca-persister.yaml
/etc/monasca-thresh/thresh.yaml
/etc/elasticsearch/elasticsearch.yml
/etc/elasticsearch/logging.yml
/etc/kibana/kibana.yml
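Backing up these files can be scripted. The following sketch copies each file that exists on the node into a date-stamped directory, preserving its path; only a subset of the list is shown, and the backup location is an assumption.

```shell
# Sketch: copy existing configuration files into a date-stamped backup
# directory, preserving their paths. The backup location is an assumed
# example, and only a subset of the full file list is shown.
BACKUP_DIR="/tmp/monasca-config-$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"
for f in /etc/influxdb/influxdb.conf /etc/monasca/agent/agent.yaml \
         /etc/elasticsearch/elasticsearch.yml /etc/kibana/kibana.yml; do
  if [ -e "$f" ]; then
    cp --parents "$f" "$BACKUP_DIR" && echo "backed up $f"
  else
    echo "skipped $f (not present)"
  fi
done
```

Files that are absent (for example, because the corresponding agent is not installed on the node) are reported as skipped rather than causing the script to fail.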
If you need to recover the configuration of one or more agents or services, the recommended procedure is as follows:
If necessary, uninstall the agents or services, and install them again.
Stop the agents or services.
Copy the backup of your configuration files to the correct locations according to the list above.
Start the agents or services again.
Kibana can persist customized log dashboard designs to the Elasticsearch database, and allows you to recall them. For details on saving, loading, and sharing log management dashboards, refer to the Kibana documentation.
Grafana allows you to export a monitoring dashboard to a JSON file, and to re-import it when necessary. For backing up and restoring the exported dashboards, use the standard mechanisms of your file system. For details on exporting monitoring dashboards, refer to the Getting Started tutorial of Grafana.