Archived community.zenoss.org
81111 Views 13 Replies Latest reply: Jan 26, 2010 3:43 PM by gramik
mwoodling Rank: White Belt 46 posts since
Aug 10, 2008

Jan 18, 2010 7:16 PM

2.4.5 memory leak?

Anyone have issues with a memory leak in Zenoss?  I run 2.4.5 on a 32-bit CentOS 5.3 dual-processor system with 4 GB of RAM.  Like clockwork, within about 2 weeks of booting the system, free swap falls to about 500K.  When it gets below that threshold, the system starts behaving erratically (no alarms, graphs not updated).  A reboot puts the free swap back to about 2 GB and behavior returns to normal.  I've attached a picture showing free swap and free memory.  Free memory is always at about 50 MB (not GB).
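For anyone who wants to watch for the low-swap condition described above without waiting for the graphs, here is a minimal Python sketch. It assumes a Linux host where /proc/meminfo reports values in kB. The function names and the 500 kB default threshold are illustrative assumptions taken from the numbers in this post, not anything Zenoss ships.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def swap_is_low(meminfo, min_free_kb=500):
    """Return True when free swap is below the threshold (hypothetical default)."""
    return meminfo.get("SwapFree", 0) < min_free_kb


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live meminfo file (Linux only)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())
```

Calling `swap_is_low(read_meminfo())` from cron or a command datasource would give a simple yes/no signal before the box starts misbehaving.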

 

This system is quite substantially over-utilized.  Load is usually somewhere between 2 and 4 for most of the day.  Performance graphs have a number of missing spots every day.

 

Thoughts?

 

Zenoss-swap-mem.png

Zenoss-free-mem.png

 

Matt

  • Ryan Matte ZenossMaster 653 posts since
    Mar 26, 2009
    1. Jan 19, 2010 8:54 AM (in response to mwoodling)
    Re: 2.4.5 memory leak?

    Just to add a few details which are missing from the post: the server has a 2.2GHz processor and he is monitoring very close to 600 devices (about 20000 datapoints).  The system is overall underpowered.  I explained to you last night that collecting performance data for that many devices on a single Zenoss collector is very performance intensive and that you need a server which will be able to handle the load.  I also explained that Zenoss always starts off using a small amount of memory and then starts using more and more as it caches stuff.  It will eventually reach a point where it will not consume any more memory (hence this is not a leak).  You simply don't have enough physical memory in that server to accommodate the memory requirements for monitoring almost 600 devices.  2.2GHz is also very low for that level of monitoring.  You need to upgrade the memory and the CPU for starters.  If you still see performance issues then you would need to get faster disks (a SAN or something).

     

    I have a server running Zenoss 2.4.5 which is monitoring 352 devices and around 18000 datapoints.  I have 6GB of RAM in the server, but the overall memory usage tops out at around 4.5GB after a few weeks (which would explain why you, having only 4GB of RAM, are seeing swap usage after Zenoss has been running for a couple of weeks).

     

    Start by adding memory to the server.  If you still experience the issue after you upgrade the RAM (which I'll bet you won't), then post again with your findings.  2.2GHz is not enough to be monitoring that many devices, so get a better processor in there as well.

  • Ryan Matte ZenossMaster 653 posts since
    Mar 26, 2009
    3. Jan 22, 2010 4:57 PM (in response to mwoodling)
    Re: 2.4.5 memory leak?

    If you continue to have issues after upgrading the RAM be sure to make it known.  If there is in fact some memory leak which we're not aware of it would be good to be able to nail down the cause.

  • gramik Rank: White Belt 40 posts since
    Oct 29, 2008
    4. Jan 26, 2010 11:22 AM (in response to Ryan Matte)
    Re: 2.4.5 memory leak?

    Hi, Ryan.

     

    I've brought this issue up before (running 2.4.2, source install).

     

    Right now I'm trying to wrap my head around the memory usage: how and why does it continually chew up more memory over time until a "certain point is reached"?  What exactly is Zenoss caching?  In the case of zenperfsnmp, it would seem that each object (device or datapoint?) represents a relatively large amount of memory used (memory / objects).  What data is it holding on to?  Each datapoint is written to its RRD file every cycle.  Is Zenoss storing all old datapoint values in memory?

    Take these examples:


    NOTES:

    All Zenoss daemons restarted on the 20th

    Ignore graph breaks - unrelated issue!

     

     


                                          Collector 1                         Collector 2
    Devices                               1,800                               850
    Data Points                           50,000                              35,000
    CPU                                   3.0 GHz Xeon (x2)                   3.4 GHz Xeon (x1)
    RAM                                   4 GB                                4 GB
    ZenPerfsnmp % RAM used after ~6 days  1.4 GB (35%)                        860 MB (21%)
    System RAM Usage                      collector1_zenoss_mem_real.PNG      collector2_zenoss_mem_ram.PNG
    System Swap Usage                     collector1_zenoss_mem_swap.PNG (*)  collector2_zenoss_mem_swap.PNG
    Zenoss RAM Usage                      collector1_zenoss_memperc.PNG       collector2_zenoss_memperc.PNG

    (*) Template changed on the 20th; it formerly just showed "available".

     

    Looking at Collector #2, you can see that at one point the system (solely running Zenoss daemons) was using >= 9 GB of combined RAM+swap.  For 850 devices (35,000 dp)!  If you calculate memory / dp you get ~250 kB per dp.  With each RRD file at only 35 kB, what is being cached?
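The per-datapoint figure above is easy to check. A small Python sketch of the arithmetic, assuming decimal units (1 GB = 1,000,000 kB) and using only the numbers quoted in this post; the function name is made up for illustration:

```python
def kb_per_datapoint(total_gb, datapoints):
    """Memory per datapoint in kB, treating 1 GB as 1,000,000 kB."""
    return total_gb * 1_000_000 / datapoints


per_dp = kb_per_datapoint(9, 35_000)   # Collector 2: 9 GB RAM+swap, 35,000 dp
rrd_file_kb = 35                       # on-disk size of one RRD file, from the post

print(f"{per_dp:.0f} kB per datapoint")
print(f"{per_dp / rrd_file_kb:.1f}x the size of one RRD file")
```

That works out to roughly 257 kB per datapoint, in line with the ~250 kB estimate and about seven times the size of the RRD file that holds the datapoint's history.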

     

    Any insights would be appreciated!  Thanks!!

  • Ryan Matte ZenossMaster 653 posts since
    Mar 26, 2009
    5. Jan 26, 2010 11:29 AM (in response to gramik)
    Re: 2.4.5 memory leak?

    I've been looking into this.  I recently edited my zope.conf and dropped the zthread setting down from 50 to 4 on one of my servers.  It appears to have not only improved performance but also reduced the amount of memory used.  I'm going to leave it running to see if it eventually creeps into swap.  Also, you mentioned zenperfsnmp using up a lot of memory.  Have you checked Zope?  On mine it's Zope that uses up the vast majority of the memory.  Zenperfsnmp comes in second with about 15% usage.

  • gramik Rank: White Belt 40 posts since
    Oct 29, 2008
    6. Jan 26, 2010 11:55 AM (in response to Ryan Matte)
    Re: 2.4.5 memory leak?

    I'm glad you asked..

    Here are the stats on my Hub server.  It runs Zope, MySQL, and perf for 30 devices/13k dp.  I seem to have to restart Zenoss about every 10 days.  After that point, Zope goes crazy and spirals out of control, resource-usage-wise.

     

    Zope & Zenhubworkers are killing the server.  These stats are from "ps" and thus more of an average over time than current-moment stats.  "top", for example, routinely has Zenhubworkers at 95-100% CPU usage.

     

    Zenhub config:

    #PARAMETER      VALUE
    workers         3
    cachesize       4000
    pcachesize      2250
    
    

     

    From Zope config:

    <zodb_db main>
      mount-point /
      cache-size 5000
      pool-size 50
      <zeoclient>
        server localhost:8100
        storage 1
        name zeostorage
        var $INSTANCE/var
        cache-size 20MB
      </zeoclient>
    </zodb_db>
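One way to read the Zope settings above, as a rough sketch only: in ZODB, `cache-size` is a target object count for each connection's cache, and each open connection keeps its own cache, so the number of objects held in memory can scale with the number of connections in use. The Python below just multiplies the two. It assumes every connection fills its cache to the target, it treats `pool-size` as the number of open connections (it is an expected maximum, not a hard cap), and it says nothing about bytes because object sizes vary. The function name is hypothetical.

```python
def cached_object_ceiling(cache_size, open_connections):
    """Rough target for objects held across all per-connection ZODB caches."""
    return cache_size * open_connections


CACHE_SIZE = 5000  # cache-size from the config above (objects per connection)

# With as many connections as pool-size allows (50):
print(cached_object_ceiling(CACHE_SIZE, 50))  # 250000

# With only 4 connections in use, e.g. after cutting threads from 50 to 4:
print(cached_object_ceiling(CACHE_SIZE, 4))   # 20000
```

If that reading is right, it would be consistent with the drop in memory use reported earlier in the thread after reducing the thread count, but it is an inference, not something confirmed here.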

     

     

    Load Average:

    hub_load.png

    CPU Usage:

    hub_cpu.png

    Real Memory Usage:

    hub_real.png

    Swap Memory Usage:

    hub_swap.png

    Zenoss CPU Usage:  (% total system)

    hub_zenoss_cpu.png

    Zenoss Memory Usage:   (% total system RAM)

    hub_zenoss_memperc.png

  • Ryan Matte ZenossMaster 653 posts since
    Mar 26, 2009
    7. Jan 26, 2010 11:58 AM (in response to gramik)
    Re: 2.4.5 memory leak?

    Did you create the CPU and Memory usage graphs yourself?

  • gramik Rank: White Belt 40 posts since
    Oct 29, 2008
    8. Jan 26, 2010 12:03 PM (in response to Ryan Matte)
    Re: 2.4.5 memory leak?

    Yes, I created a Perl script that grabs "ps" statistics.  Net-SNMP then queries that script.

     

    I can post it if desired.  It consists of a perf template (took forever to create), a Net-SNMPd config line, and the Perl script.
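Until the actual script is posted, here is a rough idea of what such a collector could look like. This is not gramik's Perl script; it is a Python sketch under stated assumptions: `ps -eo pcpu,pmem,rss,args` is available (procps on Linux), and the daemon names in `DAEMONS` are placeholders to be matched against each command line. How the output is exposed through Net-SNMP (for example via an `extend` line in snmpd.conf) is left out.

```python
import subprocess
from collections import defaultdict

# Hypothetical substrings to look for in each process command line.
DAEMONS = ("zenperfsnmp", "zenhub", "zope")


def summarize(ps_text, daemons=DAEMONS):
    """Sum %CPU, %MEM and RSS (kB) per daemon from 'ps -eo pcpu,pmem,rss,args' output."""
    totals = defaultdict(lambda: {"cpu": 0.0, "mem": 0.0, "rss_kb": 0})
    for line in ps_text.splitlines():
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue
        try:
            cpu, mem, rss = float(parts[0]), float(parts[1]), int(parts[2])
        except ValueError:
            continue  # header line or anything else that is not numeric
        for name in daemons:
            if name in parts[3]:
                totals[name]["cpu"] += cpu
                totals[name]["mem"] += mem
                totals[name]["rss_kb"] += rss
                break  # count each process once
    return dict(totals)


def collect():
    """Run ps and return the per-daemon totals."""
    out = subprocess.run(
        ["ps", "-eo", "pcpu,pmem,rss,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return summarize(out)
```

As noted earlier in this post, the %CPU that `ps` reports is averaged over the life of the process, so these totals will look calmer than what `top` shows at any given moment.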

  • Ryan Matte ZenossMaster 653 posts since
    Mar 26, 2009
    9. Jan 26, 2010 12:05 PM (in response to gramik)
    Re: 2.4.5 memory leak?

    Ah, I thought they were some default graphs which I was missing or something.  It'd be nice if they'd include graphs like that for each collector by default.

  • Ryan Matte ZenossMaster 653 posts since
    Mar 26, 2009
    10. Jan 26, 2010 12:06 PM (in response to gramik)
    Re: 2.4.5 memory leak?

    You should really make that into a ZenPack.  It would be very useful.

  • gramik Rank: White Belt 40 posts since
    Oct 29, 2008
    11. Jan 26, 2010 12:11 PM (in response to Ryan Matte)
    Re: 2.4.5 memory leak?

    Will do.  I've never created a ZenPack with anything other than perf templates in it, so I don't know how to have it include the script.  Will post them all separately in a Document here shortly.

  • Matt Ray Rank: Zen Master 2,484 posts since
    Apr 5, 2008
    12. Jan 26, 2010 12:32 PM (in response to gramik)
    Re: 2.4.5 memory leak?

    Email me your results, or let me know if you need help, and I'll get it posted.

     

    Thanks,

    Matt Ray

    Zenoss Community Manager

  • gramik Rank: White Belt 40 posts since
    Oct 29, 2008
    13. Jan 26, 2010 3:43 PM (in response to Matt Ray)
    Re: 2.4.5 memory leak?
