Jan 18, 2010 7:16 PM
2.4.5 memory leak?
Has anyone had issues with a memory leak in Zenoss? I run 2.4.5 on a 32-bit CentOS 5.3 dual-processor system with 4 GB of RAM. Like clockwork, within about 2 weeks of booting the system, free swap falls to about 500K. When it gets below that threshold, the system starts behaving erratically (no alarms, graphs not updated). A reboot puts the free swap back to about 2 GB and behavior returns to normal. I've attached a picture showing free swap and free memory. Free memory is always at about 50 MB (not GB).
This system is quite substantially over-utilized. Load is usually somewhere between 2 and 4 for most of the day. Performance graphs have a number of missing spots every day.
Thoughts?
Matt
Just to add a few details which are missing from the post: the server has a 2.2GHz processor and he is monitoring very close to 600 devices (about 20000 datapoints). The system is overall underpowered. I explained to you last night that collecting performance data for that many devices on a single Zenoss collector is very performance intensive and that you need a server which will be able to handle the load. I also explained that Zenoss always starts off using a small amount of memory and then uses more and more as it caches data. It will eventually reach a point where it does not consume any more memory (hence this is not a leak). You simply don't have enough physical memory in that server to accommodate the memory requirements for monitoring almost 600 devices. 2.2GHz is also very low for that level of monitoring. You need to upgrade the memory and the CPU for starters. If you still see performance issues, then you would need to get faster disks (a SAN or something).
I have a server running Zenoss 2.4.5 which is monitoring 352 devices and around 18000 datapoints. I have 6GB of RAM in the server, but the overall memory usage tops out at around 4.5GB after a few weeks (which would explain why you, having only 4GB of RAM, are seeing swap usage after Zenoss has been running for a couple of weeks).
Start by adding memory to the server. If you still experience the issue after you upgrade the RAM (which I'll bet that you won't), then post again with your findings. 2.2GHz is not enough to be monitoring that many devices, so get a better processor in there as well.
Many thanks, Ryan, for following up our IRC discussion in this forum. I think you've answered my questions thoroughly and correctly. As I said last night, I will start by adding more RAM (from 4 GB to 8 GB) and see how it goes. I will move to a more powerful CPU when practical and I have the option, if warranted, of moving to a SAN volume which should help performance if RAM and CPU aren't enough.
Matt
If you continue to have issues after upgrading the RAM, be sure to make it known. If there is in fact some memory leak which we're not aware of, it would be good to be able to nail down the cause.
Hi, Ryan.
I've brought this issue up before (running 2.4.2, source install).
Right now I'm trying to wrap my head around the memory usage, in terms of how and why it continually chews up more memory over time until a "certain point is reached". What exactly is Zenoss caching? In the case of zenperfsnmp, it would seem that each object (device or datapoint?) represents a relatively large amount of memory used (memory / objects). What data is it holding on to? Each datapoint is written to an RRD file every cycle. Is Zenoss storing all old datapoint values in memory?
Take these examples:
NOTES:
All Zenoss daemons restarted on the 20th
Ignore graph breaks - unrelated issue!
Looking at Collector #2, you can see that at one point the system (solely running Zenoss daemons) was using >= 9 GB of combined RAM+SWAP. For 850 devices (35,000 dp)! If you calculate memory / dp you get ~ 250kB per dp. With each RRD file at only 35kB, what is being cached?
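The per-datapoint figure above can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming decimal units (1 GB = 1,000,000 kB) and taking the 9 GB, 35,000 datapoint, and 35 kB numbers straight from the post; the function name is made up for illustration.

```python
def memory_per_datapoint_kb(total_gb, datapoints):
    """Average memory per datapoint in kB, using decimal units (1 GB = 1,000,000 kB)."""
    return total_gb * 1_000_000 / datapoints


# Figures quoted for Collector #2 in the post.
per_dp = memory_per_datapoint_kb(9, 35_000)
rrd_file_kb = 35

print(f"{per_dp:.0f} kB per datapoint")                        # 257 kB per datapoint
print(f"{per_dp / rrd_file_kb:.1f}x the size of one RRD file")  # 7.3x the size of one RRD file
```

So the "~250kB per dp" estimate holds: the resident memory per datapoint is roughly seven times the size of the RRD file that stores its history.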
Any insights would be appreciated! Thanks!!
I've been looking into this. I recently edited my zope.conf and dropped the zthread setting down from 50 to 4 on one of my servers. It appears not only to have improved performance but also to have reduced memory usage. I'm going to leave it running to see if it eventually creeps into swap. Also, you mentioned zenperfsnmp using up a lot of memory; have you checked Zope? On mine it's Zope that uses the vast majority of the memory. Zenperfsnmp comes in second with about 15% usage.
I'm glad you asked.
Here are the stats on my Hub server. It runs Zope, MySQL, and perf for 30 devices/13k dp. I seem to have to restart Zenoss about every 10 days. After that point, Zope goes crazy and spirals out of control, resource-usage-wise.
Zope & Zenhubworkers are killing the server. These stats are from "ps" and thus more of an average over time than current-moment stats. "top", for example, routinely has Zenhubworkers at 95-100% CPU usage.
Zenhub config:
#PARAMETER VALUE
workers 3
cachesize 4000
pcachesize 2250
From Zope config:
<zodb_db main>
mount-point /
cache-size 5000
pool-size 50
<zeoclient>
server localhost:8100
storage 1
name zeostorage
var $INSTANCE/var
cache-size 20MB
</zeoclient>
</zodb_db>
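One possible reading of the Zope settings above, offered as an assumption rather than a diagnosis: in ZODB, the zodb_db cache-size is normally a target number of objects held per open connection, and pool-size is the expected number of connections, so the two multiply. (The 20MB cache-size inside zeoclient is a separate ZEO client cache and is not counted here.) The sketch below only does that multiplication; the function name is hypothetical, and actual memory depends on object sizes, which the post does not give.

```python
def max_cached_objects(cache_size, connections):
    """Upper bound on objects cached across all open ZODB connections.

    Assumes each connection keeps its own cache of up to `cache_size` objects.
    """
    return cache_size * connections


# Values from the config in the post: cache-size 5000, pool-size 50.
print(max_cached_objects(5000, 50))  # 250000

# The same cache-size with only 4 connections in use.
print(max_cached_objects(5000, 4))   # 20000
```

If that reading applies here, the number of connections actually in use matters as much as cache-size when estimating how far Zope's memory can grow.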
[Attached graphs, not reproduced here: Load Average; CPU Usage; Real Memory Usage; Swap Memory Usage; Zenoss CPU Usage (% of total system); Zenoss Memory Usage (% of total system RAM)]
Did you create the CPU and Memory usage graphs yourself?
Yes, I created a Perl script that grabs "ps" statistics. Net-SNMP then queries that script.
I can post it if desired. Consists of perf template (took forever to create), Net-SNMPd config line, and the Perl script.
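Until the script is posted, here is a rough sketch of the same idea, written in Python rather than the Perl of the original and not based on it: sum the %CPU and %MEM columns of "ps" for each group of Zenoss processes. The keyword list and function names are assumptions for illustration; adjust the keywords to match how the daemons appear in your own process list.

```python
import subprocess
from collections import defaultdict


def parse_ps(output, keywords):
    """Sum %CPU and %MEM per keyword from `ps -eo pcpu,pmem,args` output.

    A process is counted under the first keyword found in its command line,
    so list more specific keywords (e.g. "zenhubworker") before general ones.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for line in output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        try:
            cpu, mem = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # header line or unparsable row
        for key in keywords:
            if key in parts[2]:
                totals[key][0] += cpu
                totals[key][1] += mem
                break
    return {key: (round(v[0], 1), round(v[1], 1)) for key, v in totals.items()}


def collect(keywords):
    """Run ps on the local machine and aggregate its output."""
    output = subprocess.check_output(
        ["ps", "-eo", "pcpu,pmem,args"], universal_newlines=True
    )
    return parse_ps(output, keywords)


if __name__ == "__main__":
    # Hypothetical keywords; check `ps -eo args` on your server for the real names.
    for name, (cpu, mem) in collect(["zenhubworker", "zenhub", "zenperfsnmp"]).items():
        print(f"{name} cpu={cpu} mem={mem}")
```

As noted earlier in the thread, the CPU figure from "ps" is CPU time divided by process lifetime, so it reads as a long-run average rather than the instantaneous value "top" shows. Exposing the output through Net-SNMP would still need an snmpd configuration line and a perf template, neither of which is sketched here.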
Ah, I thought they were some default graphs which I was missing or something. It'd be nice if they'd include graphs like that for each collector by default.
You should really make that into a ZenPack. It would be very useful.
Will do. I've never created a ZenPack with anything other than perf templates in it, so I don't know how to have it include the script. Will post them all separately in a Document here shortly.
Email me your results, or let me know if you need help, and I'll get it posted.
Thanks,
Matt Ray
Zenoss Community Manager