Archived community.zenoss.org | full text search
43992 Views 5 Replies Latest reply: Nov 24, 2009 10:39 AM by chitambira
Falk Rank: White Belt 89 posts since
Jul 27, 2007

Jul 8, 2008 3:51 AM

Problem with snmp and faulty size reporting.

Hi,

When monitoring filesystems over SNMP I get faulty readings for all Linux servers.
I guess it is a problem with my snmpd configuration.

Here is what I see.

In Zenoss:

 Mount   Total bytes   Used bytes   Free bytes   % Util
 /       9.4GB         8.8GB        565.5MB      94


and on the server:

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.4G  8.9G   78M 100% /
tmpfs                 253M     0  253M   0% /lib/init/rw
udev                   10M   52K   10M   1% /dev
tmpfs                 253M     0  253M   0% /dev/shm


Any ideas why this is?
Is it tmpfs that takes the remaining space and isn't reported on / by snmpd?

--
Regards Folke
  • beanfield Rank: Green Belt 161 posts since
    Apr 16, 2008
    1. Jul 8, 2008 1:26 PM (in response to Falk)
    RE: Problem with snmp and faulty size reporting.
    I get similar results. For instance:

    # df -h
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/xvda3       61G  5.5G   52G  10% /



    and in zenoss:

    Mount   Total bytes   Used bytes   Free bytes   % Util   Lock
    /       60.4GB        5.4GB        54.9GB       8



    However, if I use "df" without the -h option (human readable), I get the following:

    # df
    Filesystem     1K-blocks     Used Available Use% Mounted on
    /dev/xvda3      63282372  5665584  54402184  10% /



    So I have 5,665,584 1K blocks of used space. If I take the used 1K blocks, 5665584, and divide by 1024 twice, I get 5.4031219482421875. So even though the used size is really 5.4G, the -h option of df rounds it up to 5.5G. To answer your question directly, I believe it is the -h option on df that's rounding up; running df without -h confirms it. You can see what SNMP is returning by doing an snmpwalk on hrStorageTable.

    snmpwalk -v2c -c COMMUNITY HOSTNAME hrStorageSize

    You can get the SNMP index number of the volume you're trying to find the size of by clicking on it under File Systems on the OS tab. For example, the system above is my Zenoss server, and I'll query it using "localhost" for the HOSTNAME and "public" for the COMMUNITY. The SNMP index is 4.

    # snmpwalk -v2c -c public localhost hrStorageTable
    HOST-RESOURCES-MIB::hrStorageIndex.1 = INTEGER: 1
    HOST-RESOURCES-MIB::hrStorageIndex.2 = INTEGER: 2
    HOST-RESOURCES-MIB::hrStorageIndex.3 = INTEGER: 3
    HOST-RESOURCES-MIB::hrStorageIndex.4 = INTEGER: 4
    HOST-RESOURCES-MIB::hrStorageIndex.5 = INTEGER: 5
    HOST-RESOURCES-MIB::hrStorageIndex.6 = INTEGER: 6
    HOST-RESOURCES-MIB::hrStorageIndex.7 = INTEGER: 7
    HOST-RESOURCES-MIB::hrStorageType.1 = OID: HOST-RESOURCES-TYPES::hrStorageOther
    HOST-RESOURCES-MIB::hrStorageType.2 = OID: HOST-RESOURCES-TYPES::hrStorageRam
    HOST-RESOURCES-MIB::hrStorageType.3 = OID: HOST-RESOURCES-TYPES::hrStorageVirtualMemory
    HOST-RESOURCES-MIB::hrStorageType.4 = OID: HOST-RESOURCES-TYPES::hrStorageFixedDisk
    HOST-RESOURCES-MIB::hrStorageType.5 = OID: HOST-RESOURCES-TYPES::hrStorageFixedDisk
    HOST-RESOURCES-MIB::hrStorageType.6 = OID: HOST-RESOURCES-TYPES::hrStorageFixedDisk
    HOST-RESOURCES-MIB::hrStorageType.7 = OID: HOST-RESOURCES-TYPES::hrStorageFixedDisk
    HOST-RESOURCES-MIB::hrStorageDescr.1 = STRING: Memory Buffers
    HOST-RESOURCES-MIB::hrStorageDescr.2 = STRING: Real Memory
    HOST-RESOURCES-MIB::hrStorageDescr.3 = STRING: Swap Space
    HOST-RESOURCES-MIB::hrStorageDescr.4 = STRING: /
    HOST-RESOURCES-MIB::hrStorageDescr.5 = STRING: /sys
    HOST-RESOURCES-MIB::hrStorageDescr.6 = STRING: /sys/kernel/debug
    HOST-RESOURCES-MIB::hrStorageDescr.7 = STRING: /boot
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.1 = INTEGER: 1024 Bytes
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.2 = INTEGER: 1024 Bytes
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.3 = INTEGER: 1024 Bytes
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.4 = INTEGER: 4096 Bytes
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.5 = INTEGER: 4096 Bytes
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.6 = INTEGER: 4096 Bytes
    HOST-RESOURCES-MIB::hrStorageAllocationUnits.7 = INTEGER: 1024 Bytes
    HOST-RESOURCES-MIB::hrStorageSize.1 = INTEGER: 1048740
    HOST-RESOURCES-MIB::hrStorageSize.2 = INTEGER: 1048740
    HOST-RESOURCES-MIB::hrStorageSize.3 = INTEGER: 1052248
    HOST-RESOURCES-MIB::hrStorageSize.4 = INTEGER: 15820593
    HOST-RESOURCES-MIB::hrStorageSize.5 = INTEGER: 0
    HOST-RESOURCES-MIB::hrStorageSize.6 = INTEGER: 0
    HOST-RESOURCES-MIB::hrStorageSize.7 = INTEGER: 101086
    HOST-RESOURCES-MIB::hrStorageUsed.1 = INTEGER: 19500
    HOST-RESOURCES-MIB::hrStorageUsed.2 = INTEGER: 1039048
    HOST-RESOURCES-MIB::hrStorageUsed.3 = INTEGER: 922588
    HOST-RESOURCES-MIB::hrStorageUsed.4 = INTEGER: 1416399
    HOST-RESOURCES-MIB::hrStorageUsed.5 = INTEGER: 0
    HOST-RESOURCES-MIB::hrStorageUsed.6 = INTEGER: 0
    HOST-RESOURCES-MIB::hrStorageUsed.7 = INTEGER: 17733



    I can see that hrStorageDescr.4 is "/", so I know I'm querying the correct volume. My hrStorageAllocationUnits.4 is 4096 bytes, which is my block size. hrStorageSize.4 is 15820593, and multiplying that by 4096 I get 64801148928: the total size of my disk in bytes. hrStorageUsed.4 is 1416399, and multiplying that by 4096 I get 5801570304: the total used space in bytes. Note: if you want free space, you have to subtract the used space from the total size, as there is no OID in this MIB for available space.

    Anyway, my used bytes is 5801570304, and dividing that by 1024 gives 5665596, essentially the same used 1K-blocks figure df (without the -h) showed above (5665584; the small difference is presumably just disk activity between the two measurements).
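    That conversion can be double-checked with a quick shell sketch (the constants are simply copied from the walk above, not re-queried):

```shell
# Convert hrStorage block counts to bytes (values from the hrStorageTable walk above)
units=4096             # hrStorageAllocationUnits.4
size_blocks=15820593   # hrStorageSize.4
used_blocks=1416399    # hrStorageUsed.4

total_bytes=$(( size_blocks * units ))
used_bytes=$(( used_blocks * units ))
used_kb=$(( used_bytes / 1024 ))

echo "$total_bytes $used_bytes $used_kb"
# → 64801148928 5801570304 5665596
```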

    Again, that's all if I understand this correctly. :)
  • beanfield Rank: Green Belt 161 posts since
    Apr 16, 2008
    3. Jul 14, 2008 10:39 AM (in response to Falk)
    RE: Problem with snmp and faulty size reporting.

    "folke" wrote:

    "beanfield" wrote:

    Again, that's all if I understand this correctly. :)



    The thing is that the disk is full before it sends out an alert.
    I have set that value to 98% ( :oops: ), so when the small disks show the wrong value things go bad :p

    Of course I can change the value for critical, but it would be better if the graphs were correct...

    Did you change anything in your graph template so that it showed the correct value?

    --
    Regards Folke



    Hmmm... when you say "the disk is full before it sends out an alert", how are you finding out that the disk is full? Are you getting errors when you try to write to the disk in your terminal, and, more to the point, are you getting errors in /var/log/messages to the effect of "no space left on device"?

    The only reason I ask is that some utilities/scripts that watch disk space actually just do a "df -h", which, as pointed out before, will round up and can say the disk is full while it still has some space available. For instance, logwatch does this (at least on Debian it does). I installed logwatch and looked at the script that calculates free space, /usr/share/logwatch/scripts/services/zz-disk_space:
    
       if ($OSname eq "Linux") {
          $df_options = "-h -l -x tmpfs";


    So if you're basing the disk status off something like logwatch, it could be incorrectly reporting that the disk is full. Can you provide the following output from a server where you know the disk filled up before Zenoss alerted?

    df -h
    df
    snmpwalk -v2c -c COMMUNITY_STRING HOSTNAME 1.3.6.1.2.1.25.2.3



    Be sure to replace "COMMUNITY_STRING" and "HOSTNAME" appropriately.
  • chitambira Rank: Brown Belt 711 posts since
    Oct 15, 2008
    5. Nov 24, 2009 10:39 AM (in response to Falk)
    Re: RE: Problem with snmp and faulty size reporting.

    This is a very old thread, but I thought someone might stumble on it when searching for a similar problem.

    The issue with Linux filesystems is the filesystem offset (reserved space), which is not accounted for by default in Zenoss, but df does account for it.

    If you look at the two cases given above, you can see that total and used are both OK, but free is the culprit. The algorithm to calculate free space should be:

    free = total * (1 - offset) - used

    where offset is normally 0.05 (or 5%) for Linux filesystems, which means 5% of the filesystem is reserved
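
    Applying that formula to beanfield's numbers from earlier in the thread, as a quick shell sketch (the 5% reserve is the usual ext default, so the result only roughly matches df's Avail column):

```shell
# free = total*(1 - offset) - used, with offset = 5%, done in integer arithmetic
total=64801148928   # bytes: hrStorageSize.4 * 4096, from the walk above
used=5801570304     # bytes: hrStorageUsed.4 * 4096, from the walk above

free=$(( total * 95 / 100 - used ))
echo "$free"
# → 55759521177 bytes (~51.9 GiB; df reported Avail 54402184 1K-blocks = 55707836416 bytes)
```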

    docs/DOC-3233
