Zenoss Core Performance Tuning
The purpose of this article is to highlight various steps a Zenoss administrator can take to resolve performance issues. Zenoss performance issues can have the following symptoms:
- slowness of the web UI (event console, status page)
- frequent heartbeat failures
- gaps in performance graphs
- slow startup/restart times of Zenoss
The above symptoms can be caused by any number of reasons. Here are the various areas that should be investigated when trying to tune a Zenoss deployment.
System Resources
The system resource recommendations below assume the following:
- a mix of devices is being monitored (Windows, Unix, routers, switches, firewalls)
- events will be removed from the database after a reasonable amount of time (6-12mo)
- no device is spewing an enormous amount of events at Zenoss (syslog or traps)
Deployment Size: S (0-250 Devices)
4GB RAM
2 CPU Cores
300GB, 10K RPM Drives
Deployment Size: M (250-500 Devices)
4-8GB RAM
4 CPU Cores
300GB, 10K RPM Drive
Deployment Size: L (500-1000 Devices)
8GB RAM
4-8 CPU Cores
75GB, 15K RPM Drives
300GB, 10K RPM Drives
Deployment Size: XL (1000-1500 Devices)
16GB RAM
8 CPU Cores
75GB, 15K RPM Drives
300GB, 10K RPM Drives
Filesystem Tuning
Zenoss stores all its performance data in RRD files. As a result, Zenoss constantly has to access a great number of files during performance collection. Each performance update is only 8 bytes per datapoint, yet it translates into a write of a full 4KB filesystem block. Under such a high-volume, low-throughput usage pattern, journaled filesystems can severely degrade I/O performance. If possible, Zenoss servers should have a non-journaled filesystem partition for performance data (located under $ZENHOME/perf).
If setting up a separate partition/filesystem is not possible, we recommend the following mount options:
defaults,noatime,nodiratime,data=writeback,commit=100
These options should be placed into the /etc/fstab file and will take effect after rebooting the system. For more information on filesystem tuning to improve RRD performance, please see: http://oss.oetiker.ch/rrdtool-trac/wiki/TuningRRD
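As a rough aid for checking an existing /etc/fstab entry against the options above, the following sketch reports which of the recommended options are missing. The device name, mount point, and filesystem type in the example entry are placeholders chosen for illustration, not values from a real system.

```python
# Recommended mount options from the article ("defaults" is left out
# because it does not change RRD write behaviour).
RECOMMENDED = ["noatime", "nodiratime", "data=writeback", "commit=100"]


def missing_options(fstab_line, recommended=RECOMMENDED):
    """Return the recommended mount options absent from one fstab entry."""
    fields = fstab_line.split()
    # An fstab entry needs at least: device, mount point, type, options.
    if len(fields) < 4 or fields[0].startswith("#"):
        raise ValueError("not a valid fstab entry: %r" % fstab_line)
    present = set(fields[3].split(","))
    return [opt for opt in recommended if opt not in present]


# Hypothetical entry: /dev/sdb1, /opt/zenoss/perf and ext3 are assumptions.
example = "/dev/sdb1 /opt/zenoss/perf ext3 defaults,noatime 0 2"
print(missing_options(example))
```

An empty list means the entry already carries every recommended option.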
MySQL Tuning
Assuming the above system resources are met, MySQL often becomes the first bottleneck. This is due to the default MySQL configuration being a “starter only” configuration, and not intended for production use. When a Zenoss system is deployed for the first time, it is easy to underestimate the event load the system will have to deal with. Considering that Zenoss receives events from syslog, SNMP traps, and Windows event log, event load can easily add up to multiple gigabytes per week or even per day. It is therefore important to limit both the number of received events as well as the number of events retained in the database. This can be done by dropping unneeded events and configuring jobs to age events periodically.
Since Zenoss stores all event-based data in the MySQL database, MySQL performance impacts the entire Zenoss UI. The Zenoss administrator should use a configuration that allows MySQL to more fully utilize the hardware resources of the Zenoss server. The following configuration is a good starting point:
[mysqld]
user=mysql
innodb_file_per_table
# Default to using old password format for compatibility with mysql
# 3.x clients (those using the mysqlclient10 compatibility package).
old_passwords=1
skip_locking
innodb_buffer_pool_size = [see below]
innodb_additional_mem_pool_size = 32M
innodb_log_buffer_size = 2M
innodb_flush_method = O_DIRECT
innodb_fast_shutdown = 1
innodb_flush_log_at_trx_commit = 2
innodb_thread_concurrency = 4
query_cache_size = 32M
innodb_buffer_pool_size should be set to the following:
System Size | innodb_buffer_pool_size |
S | 512M |
M | 512M or 1024M (depending on RAM size) |
L | 1024M |
XL | 2048M |
XXL | 4096M |
Note: For most Zenoss-dedicated MySQL installations the above configuration will work as shown; others might require additional options to guarantee proper functioning of the database. Before deploying the configuration above, please create a backup of your current configuration, located at /etc/my.cnf. Then deploy the configuration to /etc/my.cnf and restart the MySQL server.
For administrators who want to take further steps to optimize MySQL, we recommend the excellent MySQLTuner script, which can be found at http://wiki.mysqltuner.com/MySQLTuner
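The sizing table can also be expressed as a small lookup, which may help when scripting configuration for several servers. This is only a sketch: the function names are hypothetical, the handling of counts that sit on a shared boundary (such as 250) is an assumption, and so is reading "depending on RAM size" for M systems as 1024M at 8GB or more.

```python
# Upper device-count bound per deployment size (System Resources section).
# Assumption: a count on a shared boundary (e.g. 250) falls in the smaller size.
SIZE_LIMITS = [("S", 250), ("M", 500), ("L", 1000), ("XL", 1500)]


def deployment_size(devices):
    """Map a device count to S, M, L, or XL."""
    if devices < 0:
        raise ValueError("device count cannot be negative")
    for name, limit in SIZE_LIMITS:
        if devices <= limit:
            return name
    raise ValueError("the article gives no device range above 1500")


def buffer_pool_size(size, ram_gb=4):
    """Return the innodb_buffer_pool_size value from the table above."""
    if size == "M":
        # Assumption: "depending on RAM size" is read as 1024M at 8GB or more.
        return "1024M" if ram_gb >= 8 else "512M"
    return {"S": "512M", "L": "1024M", "XL": "2048M", "XXL": "4096M"}[size]


size = deployment_size(600)
print(size, "innodb_buffer_pool_size = " + buffer_pool_size(size))
```

XXL is accepted by buffer_pool_size because the table lists it, but deployment_size never returns it, since the article does not define a device range for that size.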
Zope Performance
Zope is the application server that runs the Zenoss web UI. Because Zope works in combination with ZEO, the user experience mostly hinges on proper performance tuning of both. Unfortunately, Zope/ZEO tuning is not an exact science, and optimal performance might require several iterations. However, the settings below will help to improve performance:
Locate and configure the following settings in $ZENHOME/etc/zope.conf:
Default setting: cache-size 5000
S and M systems: cache-size 50000
L and XL systems: cache-size 100000 (or more)
The recommendations made here assume you followed the hardware recommendations above. This setting will most likely have the single biggest impact on UI performance. It is related to how many objects are in the ZEO database, which in turn is a function of how many and what type of devices are in the system. If you log in as the admin user and visit the following URL:
http://YOUR_SERVER:8080/Control_Panel/Database/main/manage_cacheParameters
you will be able to see how many objects are in your database. Please note that the cache-size setting actually specifies the number of objects. This means that it might be difficult to gauge how much memory will actually be consumed. This can only be determined after Zope has been running by actually looking at ps/top output. Also, the cache is shared by all zserver-threads (see below), which means that if you increase the number of zserver-threads it is usually recommended to also increase the cache-size. The above recommendation assumes you don't run more than 6 threads.
Default setting: zserver-threads = 4
S and M systems: zserver-threads = 4
L and XL systems: zserver-threads = 10 (also add pool-size 50 after cache-size in the configuration file)
This number represents the number of threads Zope uses to serve client requests. It is very important to understand that more does not equal better for this setting; on the contrary, some applications perform better with fewer threads. Also note that increasing zserver-threads above 7 will require an increase in pool-size (see above).
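The Zope recommendations above can be collected into one lookup per deployment size. The sketch below only restates the article's starting values; the function name is hypothetical, and each printed line still has to be applied by hand where that setting already appears in zope.conf.

```python
def zope_settings(size):
    """Return (name, value) pairs for zope.conf, in the article's order."""
    if size in ("S", "M"):
        return [("cache-size", 50000), ("zserver-threads", 4)]
    if size in ("L", "XL"):
        # More than 7 threads calls for a larger pool-size, per the article.
        return [("cache-size", 100000), ("pool-size", 50),
                ("zserver-threads", 10)]
    raise ValueError("no recommendation for size %r" % size)


for name, value in zope_settings("L"):
    print(name, value)
```

Since tuning usually takes several iterations, these values are a first attempt to be adjusted after watching memory use in ps/top.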
ZEO Performance
ZEO (Zope Enterprise Objects) is the database behind most of the Web UI. As mentioned above, ZEO and Zope both have to be tuned in order to achieve optimal UI performance. The data behind the ZEO database is stored in the $ZENHOME/var/Data.fs file. To ensure proper performance it is vital that this file be located on a fast filesystem.
Zeopack
The ZEO database keeps track of all transactions that are performed, which means that the database file grows over time and can become quite large. A larger file slows down overall access times, so we advise “packing” this file frequently. Packing removes old transactions from the data store, resulting in a smaller file and therefore faster access. We generally recommend running the following job on a set schedule via cron as the zenoss user. Because packing changes the data, it is important to take a backup before each pack.
# run pack every Monday morning at 2am
# (single quotes let the login shell started by bash -l expand $ZENHOME)
0 2 * * 1 bash -lc '$ZENHOME/bin/zeopack.py -h localhost -p 8100 >> /PATH/TO/SOME/logfile.log 2>&1'
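Since the pack should always be preceded by a backup, the two steps can be chained so that a failed backup prevents the pack from running. The following is a minimal sketch, not part of Zenoss: the function names and the example paths are hypothetical, and it uses only the zenbackup --file option and the zeopack.py -h/-p options that appear in this article.

```python
import subprocess


def pack_commands(zenhome, backup_file, host="localhost", port=8100):
    """Build the backup command and the pack command, in run order."""
    backup = [zenhome + "/bin/zenbackup", "--file=" + backup_file]
    pack = [zenhome + "/bin/zeopack.py", "-h", host, "-p", str(port)]
    return [backup, pack]


def backup_then_pack(zenhome, backup_file, run=subprocess.check_call):
    """Run the backup, then the pack.

    subprocess.check_call raises CalledProcessError on a non-zero exit
    status, so a failed backup stops the loop before the pack starts.
    """
    for cmd in pack_commands(zenhome, backup_file):
        run(cmd)


# Dry run: print the commands instead of executing them.
# /opt/zenoss and /tmp/pre-pack.tgz are placeholder paths.
for cmd in pack_commands("/opt/zenoss", "/tmp/pre-pack.tgz"):
    print(" ".join(cmd))
```

Calling backup_then_pack("/opt/zenoss", "/tmp/pre-pack.tgz") from a script run by cron would execute both steps for real.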
Data.fs on ramdisk
The ultimate speed-up can be achieved by placing the Data.fs file on a ramdisk. Note: Setting up the ramdisk and making it persistent across reboots is outside the scope of this article. Given an available ramdisk, shut down Zenoss, delete any non-Data.fs files (index, lock, tmp), and copy Data.fs to the ramdisk. Then edit the $ZENHOME/etc/zeo.conf file and change the <filestorage 1> section to look as follows:
<filestorage 1>
#old storage location
#path $INSTANCE/var/Data.fs
#used for ramdisk
path /PATH/TO/RAMDISK/Data.fs
</filestorage>
Once you restart, ZEO will access the Data.fs file in its new location. Note: Since ramdisks are highly volatile, it is recommended to perform high-frequency backups on top of your regular backups. The following script could be used to do this:
#!/bin/bash
# this script creates backups and also handles cleanups
# it is used to take care of the ramdisk
# directory for the backups
BLOC=$ZENHOME/backups/rapids/
# create backup filename so we don't conflict
FILENAME=${BLOC}'rapidBackup.'`date +%F-%H-%M`'.tgz'
/opt/zenoss/bin/zenbackup --no-eventsdb --no-perfdata --temp-dir=/apps/zenoss/backups_tmp --file=${FILENAME}
# delete all backups older than 12h (720 minutes)
find ${BLOC} -mmin +720 -delete
The above should be run every 30 minutes or even more often, and the latest backup can be used to restore data after a sudden power failure. It is important to understand that using a ramdisk can cause data loss. However, once the initial Zenoss configuration is complete, most data in ZEO changes infrequently, so the risk is small: at most 30 minutes of data will be lost.
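When restoring after a failure, the newest backup has to be picked out of the backup directory. One way to do that, sketched below, is to parse the timestamp that the script embeds in each filename (rapidBackup.YYYY-MM-DD-HH-MM.tgz, from date +%F-%H-%M). The function name and the sample filenames are made up for illustration.

```python
import re
from datetime import datetime

# Matches the names produced by the backup script: date +%F-%H-%M
PATTERN = re.compile(r"^rapidBackup\.(\d{4}-\d{2}-\d{2}-\d{2}-\d{2})\.tgz$")


def latest_backup(filenames):
    """Return the filename with the newest embedded timestamp, or None."""
    stamped = []
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d-%H-%M")
            stamped.append((when, name))
    if not stamped:
        return None
    # Tuples compare by timestamp first, so max() picks the newest backup.
    return max(stamped)[1]


# Hypothetical directory listing.
names = [
    "rapidBackup.2011-03-01-09-30.tgz",
    "rapidBackup.2011-03-01-10-00.tgz",
    "notes.txt",
]
print(latest_backup(names))
```

Relying on the name rather than the file modification time keeps the choice stable even if the files were copied elsewhere before the restore.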