Jul 28, 2009 2:16 PM
New 36G, 16CPU server
"coyote" wrote:
The senior Linux admin gave me a new Zenoss server with 16 CPUs, 36 GB of RAM, and lots of disk space.
I have Zenoss 2.4.2 installed and am trying to monitor 1800 devices.
Zenoss uses less than 12 GB of RAM and load is always less than 2. I have configured zenhub with 8 workers.
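(For reference, the 8-worker setting mentioned above is normally made in zenhub's config file; the path and the option name below are assumptions based on a default 2.4.x install:)

```text
# $ZENHOME/etc/zenhub.conf
workers 8
```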
vm.vfs_cache_pressure = 0
vm.swappiness = 20
vm.overcommit_memory = 2
vm.overcommit_ratio = 75
vm.dirty_background_ratio = 1
vm.dirty_ratio = 100
vm.dirty_expire_centisecs = 3000
vm.dirty_writeback_centisecs = 500
echo 0 > /sys/block/cciss!c0d0/queue/iosched/front_merges
echo 150 > /sys/block/cciss!c0d0/queue/iosched/read_expire
echo 1500 > /sys/block/cciss!c0d0/queue/iosched/write_expire
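Worth noting: the vm.* settings above persist if placed in /etc/sysctl.conf, but the echo lines into /sys are lost at reboot. A sketch of making both stick, with file locations assumed for a RHEL 5-style system:

```text
# /etc/sysctl.conf (applied at boot, or immediately with `sysctl -p`)
vm.vfs_cache_pressure = 0
vm.swappiness = 20
...

# /etc/rc.local (re-applied at each boot)
echo 0 > /sys/block/cciss!c0d0/queue/iosched/front_merges
echo 150 > /sys/block/cciss!c0d0/queue/iosched/read_expire
echo 1500 > /sys/block/cciss!c0d0/queue/iosched/write_expire
```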
# From zope.conf: temporary storage database (for sessions)
<zodb_db temporary>
<temporarystorage>
name temporary storage for sessioning
</temporarystorage>
mount-point /temp_folder
container-class Products.TemporaryFolder.TemporaryContainer
cache-size 500000
pool-size 16
</zodb_db>
<zodb_db main>
mount-point /
# ZODB cache, in number of objects
cache-size 500000
pool-size 16
<zeoclient>
server localhost:8100
storage 1
name zeostorage
var $INSTANCE/var
# ZEO client cache, in bytes
cache-size 10000MB
# Uncomment to have a persistent disk cache
#client zeo1
</zeoclient>
</zodb_db>
# Resetting the manage IP for all devices (run from zendmd)
for d in dmd.Devices.getSubDevices():
    d.setManageIp()
# Committing changes for all devices
commit()
# /etc/fstab entry mounting /tmp as a RAM-backed tmpfs
tmpfs /tmp tmpfs rw 0 0
"mwcotton" wrote:
What's your disk config? I have found the biggest thing that slows the server down is wait on I/O. If you've got all that extra RAM, maybe you should look at configuring a RAM drive and copying all your RRDs up there, and of course occasionally rsync them back to the hard disks. I bet the system would fly then.
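A minimal sketch of the flush-back step in that idea, assuming the RRDs live on a tmpfs mount and using plain file copies instead of rsync (the directory layout and function name are illustrative, not part of Zenoss):

```python
import os
import shutil

def flush_rrds(ramdisk_dir, persistent_dir):
    """Copy every .rrd file from the RAM-backed tree to persistent disk,
    preserving the directory layout, so a crash loses at most one interval."""
    for root, _dirs, files in os.walk(ramdisk_dir):
        rel = os.path.relpath(root, ramdisk_dir)
        dest = os.path.join(persistent_dir, rel)
        for name in files:
            if name.endswith(".rrd"):
                if not os.path.isdir(dest):
                    os.makedirs(dest)
                shutil.copy2(os.path.join(root, name),
                             os.path.join(dest, name))
```

Run from cron every few minutes; restoring at boot is the same copy in the other direction.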
"coyote" wrote:
I have 6 SATA drives in a RAID 5 configuration; not the best performance, but it is what the admin wants.
I can back off the monitoring and performance collection if I have to. sda8 is where Zenoss is installed.
# iostat
Linux 2.6.18-128.1.16.el5 (zmaster) 07/28/2009
avg-cpu: %user %nice %system %iowait %steal %idle
0.75 0.00 0.08 0.01 0.00 99.15
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 25.28 3.24 632.88 5326012 1040210946
sda1 0.00 0.02 0.00 25980 602
sda2 0.00 0.00 0.00 1877 0
sda3 0.31 0.40 5.03 652578 8261584
sda4 0.00 0.00 0.00 12 0
sda5 6.52 0.21 185.22 336968 304432832
sda6 0.15 1.50 2.63 2465424 4322896
sda7 1.08 0.00 261.88 6096 430430360
sda8 17.21 1.11 177.97 1828235 292514760
sda9 0.00 0.00 0.15 3619 246896
sda10 0.00 0.00 0.00 4607 1016
# Fix deviceSearch (run from zendmd)
brains = dmd.Devices.deviceSearch()
for d in brains:
    try:
        d.getObject()
    except Exception:
        print "Removing non-existent device from deviceSearch: " + d.getPath()
        dmd.Devices.deviceSearch.uncatalog_object(d.getPath())
commit()
# Fix componentSearch
brains = dmd.Devices.componentSearch()
for d in brains:
    try:
        d.getObject()
    except Exception:
        print "Removing non-existent component from componentSearch: " + d.getPath()
        dmd.Devices.componentSearch.uncatalog_object(d.getPath())
commit()
dmd.Devices.reIndex()
commit()
reindex()
commit()
We encountered issues with history delete not working as well. I looked through the Zenoss ticket queue and saw that this was slated for 2.5.2. Can someone tell me when this is slated to be fixed? We are getting ready to roll out a production version of Zenoss, and I would like to install the latest stable version so I do not have to worry about upgrading after the install.
One more item: the problem we run into with MySQL is that once the data grows, it is difficult to reclaim disk space even after history delete, OPTIMIZE TABLE, etc. are run. We were informed that an export and import of the database is required to reclaim the space. Has anyone run into this issue before?
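(On the space-reclaim question: with InnoDB's default shared tablespace, ibdata1 never shrinks back to the OS, so dump-and-reload is the usual answer. A sketch only; the `events` database name is the Zenoss 2.x default, but credentials and paths here are assumptions to verify before running anything:)

```text
zenoss stop                               # nothing should write during the dump
mysqldump -u zenoss -p events > /tmp/events.sql
mysql -u root -p -e "DROP DATABASE events; CREATE DATABASE events;"
# (Enabling innodb_file_per_table in my.cnf before the reload makes
#  future growth per-table and much easier to reclaim.)
mysql -u zenoss -p events < /tmp/events.sql
zenoss start
```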
I have a few questions regarding your environment. We plan to deploy at a similar size and I am trying to leverage any lessons learned here.
I would appreciate any information you can provide.
Thank you,
- Ken
It sounds like there are a couple of folks, at least, that are looking for performance/tuning ideas. I'll try to give a run-down of what we have running, and will do my best to field follow up questions.
Our "Primary" Zenoss server -- runs pretty much everything -- MySQL, Zope DB, UI, monitoring, modeling (nightly from cron), performance collection.
This server is collecting perf data on about 250 devices.
It is Ubuntu, quad 2.5 GHz, with 8 GB. It also has SAS disks, just mirrored, no RAID 5.
All traps, etc. come directly into this server, none to the collectors.
Our "Large Collector" -- collects for about 1400 devices
Also Ubuntu, but a dual-core 3.2 GHz, also with 8 GB. This is an older box with SCSI disks.
It also does modeling each night (I run modeler from cron--not as a daemon--once/day is enough for us)
Since this thing pretty much runs collection around the clock, I went for spindles. Like others have said, RAID 5 is really going to put a damper on things...
I went even a step further....
Not much goes into $ZENHOME/perf--it's on the same file system as the mysql, zope, code, root, boot, etc.
I set up 8 smaller disks as (4) 2-disk bundles named
/perf1
/perf2
/perf3
/perf4
Zenoss looks in $ZENHOME/perf for the rrd directories, but I have them linked over to the /perfN disks--splitting devices somewhat equally. So, when writing/updating RRD tables, performance looks something like this:
                  DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
07:56:59 PM dev104-16 541.33 0.00 4698.67 8.68 108.23 197.79 1.85 100.00
07:56:59 PM dev104-32 478.33 0.00 4288.00 8.96 108.06 224.59 2.09 100.00
07:56:59 PM dev104-48 571.00 0.00 4805.33 8.42 107.82 189.89 1.75 100.00
07:56:59 PM dev104-64 719.67 0.00 6525.33 9.07 108.16 145.74 1.39 100.00
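The device-to-spindle split described above can be kept stable with a small helper; a hypothetical sketch (the /perfN layout and $ZENHOME/perf/Devices path are from the post, everything else is illustrative):

```python
import os
import zlib

def pick_spindle(device_name, n_spindles=4):
    """Stable hash of the device name onto one of the /perfN mounts."""
    return (zlib.crc32(device_name.encode("utf-8")) & 0xffffffff) % n_spindles + 1

def link_device(perf_root, device_name, n_spindles=4):
    """Return the real RRD path for a device and the symlink Zenoss follows.
    (The actual filesystem calls are left commented out in this sketch.)"""
    target = "/perf%d/%s" % (pick_spindle(device_name, n_spindles), device_name)
    link = os.path.join(perf_root, "Devices", device_name)
    # os.makedirs(target)        # create the real directory on the chosen spindle
    # os.symlink(target, link)   # point $ZENHOME/perf/Devices/<name> at it
    return target, link
```

Because the hash is stable, re-running it never moves a device's existing RRDs to a different spindle.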
Copyright © 2005-2011 Zenoss, Inc.