Archived community.zenoss.org | full text search
8847 Views 22 Replies Latest reply: Sep 8, 2011 7:58 PM by ones_and_zeros
ebogaard Rank: White Belt 41 posts since
Jun 9, 2011

Sep 3, 2011 11:30 AM

Zenoss 3.2 breaks updating certain graphs

Today I updated from 3.1 to 3.2. This process was quite easy and well described in the documentation.

After the update, certain templates no longer seem to work. I have problems with Apache, b_fping and the Postfix-via-snmp solution described elsewhere on this forum.

 

The RRDs aren't updated anymore, even though everything seems to be okay in zencommand.log (at debug level):

 

-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_bytesPerReq.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_cpuLoad.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 apache_slotDNSLookup.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_slotKeepAlive.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_slotLogging.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_slotOpen.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_slotReadingRequest.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_slotSendingReply.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_slotWaiting.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_totalAccesses.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:27 apache_totalKBytes.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:24 fping_avg.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:24 fping_loss.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:24 fping_max.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:24 fping_min.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:24 fping_rcv.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 fping_xmt.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 laLoadInt15_laLoadInt15.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 laLoadInt1_laLoadInt1.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 laLoadInt5_laLoadInt5.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 ldap_time.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 memAvailReal_memAvailReal.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 memAvailSwap_memAvailSwap.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 memBuffer_memBuffer.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 memCached_memCached.rrd
drwxr-x---  5 zenoss zenoss  4096 May 19 17:28 os
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 postfixBounced_postfixBounced.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 postfixQueue_postfixQueue.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 postfixReceived_postfixReceived.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 postfixRejected_postfixRejected.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 postfixRelayed_postfixRelayed.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 13:28 postfixSent_postfixSent.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 ssCpuIdle_ssCpuIdle.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 ssCpuRawWait_ssCpuRawWait.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 ssCpuSystem_ssCpuSystem.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 ssCpuUser_ssCpuUser.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 ssIORawReceived_ssIORawReceived.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 ssIORawSent_ssIORawSent.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 SSL_Check_Days.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:24 sysUpTime_sysUpTime.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:21 ZenJMX Heap Memory_committed.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:21 ZenJMX Heap Memory_used.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:21 ZenJMX Non-Heap Memory_committed.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:21 ZenJMX Non-Heap Memory_used.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:21 ZenJMX Open File Descriptors_OpenFileDescriptorCount.rrd
-rw-r--r--  1 zenoss zenoss 35432 Sep  3 17:21 ZenJMX Thread Count_ThreadCount.rrd

 

As you can see, I updated Zenoss around 13:30, and since then certain RRDs aren't updated anymore.

Also memBuffer_memBuffer.rrd and memCached_memCached.rrd don't get updates.
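
To make the comparison in the listing above less manual, something like the following could split the RRD files in one device directory by modification time. This is only a rough sketch: the function name split_by_mtime and the cutoff argument are made up for illustration, and a file's mtime is only a hint about whether data is still being written.

```python
import os


def split_by_mtime(rrd_dir, cutoff_epoch):
    """Return (fresh, stale) lists of .rrd file names in rrd_dir.

    A file counts as fresh when its mtime is at or after cutoff_epoch
    (seconds since the epoch, e.g. the time of the upgrade).
    """
    fresh, stale = [], []
    for name in sorted(os.listdir(rrd_dir)):
        if not name.endswith('.rrd'):
            continue  # skip sub-directories such as "os" and other files
        mtime = os.path.getmtime(os.path.join(rrd_dir, name))
        if mtime >= cutoff_epoch:
            fresh.append(name)
        else:
            stale.append(name)
    return fresh, stale
```

Calling it with the device's perf directory and the upgrade time as the cutoff should reproduce the two groups visible in the listing (13:2x versus 17:2x).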

  • dhopp Rank: Green Belt 184 posts since
    Jul 17, 2007
    1. Sep 3, 2011 12:50 PM (in response to ebogaard)
    Re: Zenoss 3.2 breaks updating certain graphs

    I'm seeing this too on a test box that I upgraded.  This has happened before:

     

    thread/15940

     

    Googling around it seems it is actually a bug with how rrdtool is compiled and the kernel that your distro is running.  What distro are you using?

     

    --Dennis

  • dhopp Rank: Green Belt 184 posts since
    Jul 17, 2007
    3. Sep 3, 2011 3:45 PM (in response to ebogaard)
    Re: Zenoss 3.2 breaks updating certain graphs

    There wasn't a fix per se, only a workaround: adding this cron job under the zenoss user:

     

    27 2 * * 6 find /usr/local/zenoss/zenoss/perf/Devices/ -name "*.rrd" -execdir touch '{}' +

     

    I might play around trying to see if there is a way to recompile the rrdtool and rrdupdate that zenoss uses.  However, it is weird that this seems to only happen after an upgrade.
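
For anyone who would rather run the same workaround from Python, a rough equivalent of the find/touch command above might look like this. The function name touch_rrds is made up for illustration, and the perf root is passed in because the path differs between installs.

```python
import os


def touch_rrds(perf_root):
    """Set atime/mtime to 'now' on every .rrd file below perf_root.

    Returns the number of files touched. Only timestamps change;
    file contents are left alone.
    """
    touched = 0
    for dirpath, _dirnames, filenames in os.walk(perf_root):
        for name in filenames:
            if name.endswith('.rrd'):
                # times=None means "use the current time", like touch(1)
                os.utime(os.path.join(dirpath, name), None)
                touched += 1
    return touched
```

Like the cron job, this only hides the stale timestamps; it does not make a datapoint that is no longer being stored start updating again.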

     

    --Dennis

  • dhopp Rank: Green Belt 184 posts since
    Jul 17, 2007
    4. Sep 3, 2011 3:46 PM (in response to ebogaard)
    Re: Zenoss 3.2 breaks updating certain graphs

    I also should ask.  Are you still seeing the graphs updated in the UI?  My file timestamps are not updating but the data is definitely being updated when I look at graphs in Zenoss.

     

    The problem with the timestamps not getting updated is that zenperfsnmp removes files older than 30 days, so it will mistakenly remove RRDs that are still in use.

     

    --Dennis

  • dhopp Rank: Green Belt 184 posts since
    Jul 17, 2007
    6. Sep 4, 2011 11:35 AM (in response to ebogaard)
    Re: Zenoss 3.2 breaks updating certain graphs

    I looked closer at my data and you are correct.  OIDs that start with a '.' do not seem to be getting updated.  However, telling Zenoss to test that datapoint from the UI, with or without the '.', returns data, so it seems to be a problem in zenperfsnmp when it goes to store the data.

     

    In particular, the Linux Device template has memBuffer and memCached OIDs that start with a '.'.  I verified that, out of all of the Device datapoints, those were the only ones not being updated.  Looking closer at the graphs, those datapoints were showing up as nan%, which means they were not getting updated.  I removed the '.' and restarted Zenoss (I probably could have just restarted a daemon or two, but figured this would guarantee the new value got read), and the RRD is now being updated.
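
As a small illustration of the manual fix described above, this sketch strips the leading '.' from an OID string and lists which datapoints in a mapping would need the change. The function names and the example mapping are hypothetical; nothing here talks to Zenoss itself, so the edited OIDs would still have to be entered in the template by hand.

```python
def normalize_oid(oid):
    """Return the OID without surrounding whitespace or leading dots."""
    return oid.strip().lstrip('.')


def datapoints_with_leading_dot(oids_by_datapoint):
    """Return the sorted names of datapoints whose OID starts with '.'."""
    return sorted(
        name for name, oid in oids_by_datapoint.items()
        if oid.strip().startswith('.')
    )


# Illustrative values only; check the OIDs in your own template.
example = {
    'memBuffer': '.1.3.6.1.4.1.2021.4.14.0',
    'memAvailReal': '1.3.6.1.4.1.2021.4.6.0',
}
affected = datapoints_with_leading_dot(example)
fixed = dict((name, normalize_oid(oid)) for name, oid in example.items())
```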

     

    I agree, this seems like a bug so I opened a ticket:

     

    http://dev.zenoss.org/trac/ticket/7859

     

    --Dennis

  • dhopp Rank: Green Belt 184 posts since
    Jul 17, 2007
    8. Sep 4, 2011 3:24 PM (in response to ebogaard)
    Re: Zenoss 3.2 breaks updating certain graphs

    I'm not sure about the f_ping since I don't use it but I'm seeing the same thing with the Apache Monitor.  Running:

     

    zencommand run -v 10

     

    gives me this output:

     

    2011-09-04 14:22:56,550 DEBUG zen.zencommand: Storing slotDNSLookup = 0.0 into Devices/172.18.128.102/apache_slotDNSLookup
    2011-09-04 14:22:56,551 DEBUG zen.RRDUtil: /opt/zenoss/perf/Devices/172.18.128.102/apache_slotDNSLookup.rrd: 0.0
    2011-09-04 14:22:56,551 DEBUG zen.zencommand: RRD save result: 0.0
    2011-09-04 14:22:56,551 DEBUG zen.zencommand: Next command in 9 seconds
    2011-09-04 14:22:56,552 DEBUG zen.zencommand: Received exit code: 3

     

    It seems something is failing, but I haven't been able to figure out what yet.  On a 3.1 install I get:

     

    2011-09-04 15:16:00,549 DEBUG zen.zencommand: Storing slotDNSLookup = 0.0 into Devices/172.18.128.102/apache_slotDNSLookup
    2011-09-04 15:16:00,550 DEBUG zen.RRDUtil: /opt/zenoss/perf/Devices/172.18.128.102/apache_slotDNSLookup.rrd: 0.0
    2011-09-04 15:16:00,550 DEBUG zen.zencommand: RRD save result: 0.0
    2011-09-04 15:16:00,550 DEBUG zen.zencommand: Storing totalAccesses = 355.0 into Devices/172.18.128.102/apache_totalAccesses
    2011-09-04 15:16:00,551 DEBUG zen.RRDUtil: /opt/zenoss/perf/Devices/172.18.128.102/apache_totalAccesses.rrd: 355L
    2011-09-04 15:16:00,551 DEBUG zen.zencommand: RRD save result: None
    2011-09-04 15:16:00,552 DEBUG zen.zencommand: Storing slotKeepAlive = 0.0 into Devices/172.18.128.102/apache_slotKeepAlive

     

    And it does the same 3 lines for every datasource defined in the template.  I'm going to do a little more debugging and will open another ticket specific to this issue.  I did a quick search and don't see any existing tickets yet for zencommand.
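
To compare the two runs without reading the log by eye, a sketch like the following could pull the datapoint names out of the "Storing ... into ..." debug lines and report which expected ones never appear. The regular expression is based only on the log lines quoted above, and the function names are made up for illustration.

```python
import re

# Matches lines such as:
#   ... DEBUG zen.zencommand: Storing slotDNSLookup = 0.0 into Devices/...
STORING = re.compile(r"DEBUG zen\.zencommand: Storing (\S+) = (\S+) into (\S+)")


def stored_datapoints(log_lines):
    """Return the datapoint names found in 'Storing' lines, in order."""
    names = []
    for line in log_lines:
        match = STORING.search(line)
        if match:
            names.append(match.group(1))
    return names


def missing_datapoints(expected, log_lines):
    """Return the expected datapoint names that were never stored."""
    seen = set(stored_datapoints(log_lines))
    return [name for name in expected if name not in seen]
```

Fed the 3.2 output above and the datapoint names from the template, missing_datapoints would list everything except slotDNSLookup.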

     

    --Dennis

  • dhopp Rank: Green Belt 184 posts since
    Jul 17, 2007
    11. Sep 6, 2011 7:04 AM (in response to ebogaard)
    Re: Zenoss 3.2 breaks updating certain graphs

    It's definitely a bug.  I did a fresh install of 3.2 and added one device and bound the Apache template to the device.  It only updates the first value that is returned.  I opened a ticket for it. 

     

    Hopefully there will be a patch for it fairly quickly, since monitoring is essentially broken and there is no good workaround (besides breaking out the template into a bunch of separate data sources with 1 data point each).

     

    --Dennis

  • omeganon Rank: White Belt 69 posts since
    Jun 23, 2011
    12. Sep 6, 2011 10:54 AM (in response to dhopp)
    Re: Zenoss 3.2 breaks updating certain graphs

    Confirming that I see this too. Ubuntu 8.0.4, 3.2.0 stack install.

  • slashmili Rank: White Belt 3 posts since
    Jun 16, 2011
    13. Sep 7, 2011 3:27 AM (in response to ebogaard)
    Re: Zenoss 3.2 breaks updating certain graphs

    The apache-graph issue is related to zenoss/Products/ZenRRD/parsers/Auto.py.

    If you debug the code, you'll find that someone did their job in a hurry.

    Change the Auto class to:

     

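    # CommandParser, CacParser, NagParser and getExitMessage are assumed
    # to be already available in Auto.py.
    class Auto(CommandParser):

        def processResults(self, cmd, result):
            # Only the first line of the command output is parsed.
            output = cmd.result.output
            output = output.split('\n')[0].strip()
            exitCode = cmd.result.exitCode
            severity = cmd.severity
            # Split the output into a message and a performance-data part.
            if output.find('|') >= 0:
                msg, values = output.split('|', 1)
            elif CacParser.search(output):
                msg, values = '', output
            else:
                msg, values = output, ''
            msg = msg.strip() or 'Cmd: %s - Code: %s - Msg: %s' % (
                cmd.command, exitCode, getExitMessage(exitCode))
            # A non-zero exit code raises an event; exit code 2 bumps the severity.
            if exitCode != 0:
                if exitCode == 2:
                    severity = min(severity + 1, 5)
                result.events.append(dict(device=cmd.deviceConfig.device,
                                          summary=msg,
                                          severity=severity,
                                          message=msg,
                                          performanceData=values,
                                          eventKey=cmd.eventKey,
                                          eventClass=cmd.eventClass,
                                          component=cmd.component))
            # Match each space-separated value on its own, so that every
            # datapoint is stored rather than only the first one.
            for value in values.split(' '):
                if value.find('=') > 0:
                    parts = NagParser.match(value)
                else:
                    parts = CacParser.match(value)
                if not parts:
                    continue
                label = parts.group(1).replace("''", "'")
                try:
                    value = float(parts.group(3))
                except (TypeError, ValueError):
                    # Unparseable value: store it as unknown.
                    value = 'U'
                # Attach the value to the datapoint whose id matches the label.
                for dp in cmd.points:
                    if dp.id == label:
                        result.values.append((dp, value))
                        break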

     

     

    UPDATE:

    Alternatively, you can change the NagParser value to

    NagParser = re.compile(r"""([^ =']+|'(.*)'+)=([-0-9.eE]+)([^; ]*;?){0,5}""")

    and leave the Auto class unchanged. That said, I think changing the class's code is safer, because I don't know what NagParser is (it might be the Nagios parser!) or what pattern it is meant to match; I've only changed it so that the regex matches the returned value.
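
To see what the per-value matching does outside of Zenoss, here is a standalone sketch that applies the regex quoted above to each space-separated token of a performance-data string. The function name parse_perfdata and the sample output line are made up for illustration, and it only covers the 'label=value' case, not the CacParser branch.

```python
import re

# The pattern quoted in the post above.
NAG_PARSER = re.compile(r"""([^ =']+|'(.*)'+)=([-0-9.eE]+)([^; ]*;?){0,5}""")


def parse_perfdata(output):
    """Return {label: value} for the part of the first line after '|'.

    Values that cannot be converted to float are stored as 'U',
    mirroring the class above.
    """
    first_line = output.split('\n')[0].strip()
    if '|' not in first_line:
        return {}
    _msg, values = first_line.split('|', 1)
    parsed = {}
    for token in values.split(' '):
        parts = NAG_PARSER.match(token)
        if not parts:
            continue  # empty or non-matching token
        label = parts.group(1).replace("''", "'")
        try:
            parsed[label] = float(parts.group(3))
        except ValueError:
            parsed[label] = 'U'
    return parsed


# Hypothetical plugin output, not taken from a real Apache check.
sample = "STATUS OK|totalAccesses=355 slotDNSLookup=0 cpuLoad=0.25"
result = parse_perfdata(sample)
```

Because each token is matched separately, all three labels in the sample end up in the result, which is the behaviour the patched class is after.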
