Monitoring the status of NFS shares with Zenoss

Created on: Jan 21, 2010 4:07 PM by Ryan Matte - Last Modified: Jan 22, 2010 7:10 PM by Ryan Matte

I frequently get asked how one would go about monitoring the status of NFS shares with Zenoss. I recently came up with a solution.

For starters, you may have to remove "networkDisk" from zFileSystemMapIgnoreTypes in zProperties and remodel your devices (if Zenoss hasn't picked up the NFS shares).

You will then need to navigate to /Events/Perf/Snmp and select More -> Transform from the dropdown. Insert the following transform and click Save.

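import re

# Act only on Debug (severity 1) events that mention the hrStorageUsed OID.
# The dots are escaped so they match literally.
if re.search(r'1\.3\.6\.1\.2\.1\.25\.2\.3\.1\.6', evt.summary) and evt.severity == 1:
    # The filesystem name is the quoted string in the summary.
    m = re.search(r'"(.*)"', evt.summary)
    d = dmd.Devices.findDevice(evt.device)
    if m and m.group(1) and d:
        share = m.group(1)
        for f in d.os.filesystems():
            # Plain prefix checks, so regex characters in a mount point are harmless.
            if share.startswith(f.mount) and f.type.startswith('networkDisk'):
                evt.summary = 'NFS share %s has become unavailable.' % share
                evt.message = evt.summary
                evt.eventClass = '/Perf/Filesystem'
                evt.component = share
                evt.severity = 5  # Critical
                # Keep a remodel from deleting the share while it is down.
                f.lockFromDeletion()
                # Push the config so the collector resumes polling the share.
                d.pushConfig()
                txnCommit()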

This transform applies to any Debug event (severity 1) whose summary contains 1.3.6.1.2.1.25.2.3.1.6. This is the OID used in the filesystems template (hrStorageUsed). The transform extracts the name of the filesystem from the quoted portion of the summary. Once it has the name, it looks up the matching filesystem object on the device and checks that it is an NFS share (that is, that its type is "networkDisk").
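The extraction step can be tried outside Zenoss. The following is a minimal sketch, not part of the transform itself: the function name extract_share and the sample summary text are made up for illustration, and real summaries from your collector may be worded differently.

```python
import re

# OID from the article; dots are escaped so they match literally.
STORAGE_USED_OID = re.compile(r'1\.3\.6\.1\.2\.1\.25\.2\.3\.1\.6')
# Anything between the first and last double quote in the summary.
QUOTED_NAME = re.compile(r'"(.*)"')

def extract_share(summary):
    """Return the quoted filesystem name, or None if the summary does not qualify."""
    if not STORAGE_USED_OID.search(summary):
        return None
    m = QUOTED_NAME.search(summary)
    if m and m.group(1):
        return m.group(1)
    return None

# Hypothetical summary text, for illustration only.
sample = 'Error reading value for "/mnt/nfs_share" (oid 1.3.6.1.2.1.25.2.3.1.6.35 is bad)'
print(extract_share(sample))
print(extract_share('Some unrelated event'))
```

Because the pattern is greedy, a summary containing more than one quoted string would return everything between the first and the last quote, so it is worth checking a real event summary before relying on it.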

Once it has confirmed that the filesystem is an NFS share, the transform rewrites the event. It sets the event summary and message, changes the eventClass to "/Perf/Filesystem" (which is more fitting), sets the event component to the name of the share, and raises the event severity to Critical (5). It then locks the share from deletion in Zenoss (in case an automated remodel kicks off while the share is unavailable).

Lastly, it pushes changes to the collector (the same as going to Manage -> Push Changes from the device page). Pushing changes is necessary because when a debug event comes in for a component, Zenoss stops monitoring the component until it detects a change in configuration (which is accomplished by pushing changes to the collector). This ensures that Zenoss will continue to generate an alert for the share every polling cycle until it is available again. It also ensures that monitoring of the share resumes after it has been restored. It will take 15 to 20 minutes for performance data to start showing up again on the utilization graph once the share has been restored.
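If you want to dry-run the rewrite rules before pasting anything into Zenoss, something like the following sketch may help. FakeEvent, FakeFilesystem, and rewrite_event are hypothetical stand-ins written for this example, not Zenoss classes, and the sample summary is invented. The filesystem match uses plain string prefix checks. The push to the collector and the commit are left out because they need a live Zenoss instance.

```python
import re

class FakeEvent(object):
    """Hypothetical stand-in for the evt object a transform receives."""
    def __init__(self, summary, severity):
        self.summary = summary
        self.message = summary
        self.severity = severity
        self.eventClass = '/Perf/Snmp'
        self.component = ''

class FakeFilesystem(object):
    """Hypothetical stand-in for a modeled filesystem component."""
    def __init__(self, mount, fstype):
        self.mount = mount
        self.type = fstype
        self.locked = False

    def lockFromDeletion(self):
        # Only records the call; the real method changes the Zenoss object.
        self.locked = True

def rewrite_event(evt, filesystems):
    """Apply the rewrite rules described above; return True if evt was changed."""
    if evt.severity != 1:
        return False
    if not re.search(r'1\.3\.6\.1\.2\.1\.25\.2\.3\.1\.6', evt.summary):
        return False
    m = re.search(r'"(.*)"', evt.summary)
    if not m or not m.group(1):
        return False
    share = m.group(1)
    changed = False
    for f in filesystems:
        if share.startswith(f.mount) and f.type.startswith('networkDisk'):
            text = 'NFS share %s has become unavailable.' % share
            evt.summary = text
            evt.message = text
            evt.eventClass = '/Perf/Filesystem'
            evt.component = share
            evt.severity = 5  # Critical
            f.lockFromDeletion()
            changed = True
    return changed

# Hypothetical data, for illustration only.
evt = FakeEvent('Error reading value for "/mnt/nfs_share" (oid 1.3.6.1.2.1.25.2.3.1.6.35)', 1)
filesystems = [FakeFilesystem('/', 'fixedDisk'),
               FakeFilesystem('/mnt/nfs_share', 'networkDisk')]
changed = rewrite_event(evt, filesystems)
print(evt.summary)
```

The local disk mounted at "/" is a prefix of the share name as well, which is why the type check matters: only the networkDisk entry is locked and reported.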
