Aug 31, 2011 4:15 AM
zendisc runs forever, never attempts to contact remote host
I'm trying to discover a device, but when running zendisc, I get the following output:
2011-08-31 18:09:26,375 DEBUG zen.ZenDisc: Run in foreground, starting immediately.
2011-08-31 18:09:29,226 DEBUG zen.ZenDisc: Run in foreground, starting immediately.
2011-08-31 18:09:29,238 DEBUG zen.pysamba: client ntlmv2 auth is now no
2011-08-31 18:09:29,238 DEBUG zen.ZenDisc: Starting PBDaemon initialization
2011-08-31 18:09:29,239 INFO zen.ZenDisc: Connecting to localhost:8789
2011-08-31 18:09:29,241 DEBUG zen.ZenDisc: Logging in as admin
2011-08-31 18:09:29,249 INFO zen.ZenDisc: Connected to ZenHub
2011-08-31 18:09:29,249 DEBUG zen.ZenDisc: Setting up initial services: EventService, DiscoverService
2011-08-31 18:09:29,250 DEBUG zen.ZenDisc: Chaining getInitialServices with d2
2011-08-31 18:09:29,252 DEBUG zen.ZenDisc: Loaded service EventService from zenhub
2011-08-31 18:09:29,252 DEBUG zen.ZenDisc: Loaded service DiscoverService from zenhub
2011-08-31 18:09:29,252 DEBUG zen.ZenDisc: Queueing event {'severity': 0, 'component': 'zendisc', 'agent': 'zendisc', 'summary': 'started', 'manager': 'host.domain.tld', 'device': 'localhost', 'eventClass': '/App/Start', 'monitor': 'localhost'}
2011-08-31 18:09:29,253 DEBUG zen.ZenDisc: Total of 1 queued events
2011-08-31 18:09:29,254 DEBUG zen.ZenDisc: Calling connected.
2011-08-31 18:09:29,254 DEBUG zen.ZenDisc: fetching monitor properties
2011-08-31 18:09:29,267 DEBUG zen.ZenDisc: Getting threshold classes...
2011-08-31 18:09:29,268 DEBUG zen.ZenDisc: Loading classes ['Products.ZenModel.MinMaxThreshold']
2011-08-31 18:09:29,269 DEBUG zen.ZenDisc: Fetching default RRDCreateCommand...
2011-08-31 18:09:29,270 DEBUG zen.ZenDisc: Getting collector thresholds...
2011-08-31 18:09:29,279 DEBUG zen.thresholds: Updating threshold ('high event queue', ('localhost', ''))
2011-08-31 18:09:29,279 DEBUG zen.thresholds: Updating threshold ('zenmodeler cycle time', ('localhost', ''))
2011-08-31 18:09:29,279 DEBUG zen.thresholds: Updating threshold ('zenperfsnmp cycle time', ('localhost', ''))
2011-08-31 18:09:29,279 DEBUG zen.thresholds: Updating threshold ('zenping cycle time', ('localhost', ''))
2011-08-31 18:09:29,279 DEBUG zen.thresholds: Updating threshold ('zenprocess cycle time', ('localhost', ''))
2011-08-31 18:09:29,280 DEBUG zen.ZenDisc: Getting collector plugins for each DeviceClass
2011-08-31 18:09:29,325 INFO zen.ZenDisc: Looking for nnn.nnn.nnn.nnn
2011-08-31 18:09:29,326 DEBUG zen.ZenDisc: Found IP nnn.nnn.nnn.nnn for device host.domain.tld
2011-08-31 18:09:29,328 DEBUG zen.ZenDisc: Scanning device with address nnn.nnn.nnn.nnn
2011-08-31 18:09:29,328 DEBUG zen.ZenDisc: Doing SNMP lookup on device nnn.nnn.nnn.nnn
2011-08-31 18:09:34,288 DEBUG zen.Ping: unexpected pkt xxx.xxx.xxx.xxx <ICMP packet 0 0>
2011-08-31 18:09:35,301 DEBUG zen.Ping: unexpected pkt yyy.yyy.yyy.yyy <ICMP packet 0 0>
2011-08-31 18:09:36,301 DEBUG zen.Ping: unexpected pkt xxx.xxx.xxx.xxx <ICMP packet 0 0>
.....
And that's it - the zen.Ping messages keep rolling in as pings come back from those hosts, but nothing more happens on this thread, and dumping traffic from the monitoring host to the remote host shows that absolutely no attempt to contact the remote side is made. I'm really stumped and would appreciate any suggestions - the Python code is a little beyond my familiarity at that point.
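For anyone trying to spot the stall point in a long -v10 log, a small filter that ignores the zen.Ping noise can help. This is only a sketch: it assumes the line layout shown in the log above (timestamp, level, logger name, message), and the helper name `last_message` is made up for illustration, not part of Zenoss.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2011-08-31 18:09:29,328 DEBUG zen.ZenDisc: Doing SNMP lookup on device ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<logger>[\w.]+): (?P<msg>.*)$"
)


def last_message(lines, logger="zen.ZenDisc"):
    """Return (timestamp, message) of the last line from `logger`, or None."""
    last = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("logger") == logger:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            last = (ts, match.group("msg"))
    return last


# Three lines taken from the log above.
sample = [
    "2011-08-31 18:09:29,328 DEBUG zen.ZenDisc: Scanning device with address nnn.nnn.nnn.nnn",
    "2011-08-31 18:09:29,328 DEBUG zen.ZenDisc: Doing SNMP lookup on device nnn.nnn.nnn.nnn",
    "2011-08-31 18:09:34,288 DEBUG zen.Ping: unexpected pkt xxx.xxx.xxx.xxx <ICMP packet 0 0>",
]

print(last_message(sample))
```

Run against the full log, this should report the 'Doing SNMP lookup on device' line as the last thing zen.ZenDisc itself logged, regardless of how many ping messages follow it.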
I haven't personally seen this, but I have heard reports of some firewalls not liking the custom-created ping packets Zenoss sends. They're not the same as the OS-created ones, and the various test pings are misleading . . .
If you delete the zendisc job, start a packet capture and re-add the job, do you really never see *any* packets from Zenoss to the remote device?
--
James Pulver
Information Technology Area Supervisor
LEPP Computer Group
Cornell University
I tried earlier without any firewall, but there are absolutely no packets sent from zenoss to the remote device - that's what has me stumped.
sPOiDar:
Do you have Netscreens or SSGs along the path?
Don't worry about the unexpected packet notices. They only mean Zenoss received an unexpected ICMP packet, which happens when someone or something is pinging the Zenoss server.
Best,
--Hackman238
Yeah, I wasn't concerned about the ping debugs, just included them for completeness - everything pretty much just stops after 'Doing SNMP lookup on device'.
To answer your question, I can't be certain - the Zenoss host and host to be monitored are Xen VMs hosted with a third party - there is a router of some sort between them, but I don't know what it is.
With that said, intermediary devices must be irrelevant at this point, since no traffic whatsoever leaves the Zenoss host destined for the remote host.
I traced the execution to zendisc.py, lines 624-626:
    yield self.discoverDevice(ip,
                              devicepath=self.options.deviceclass,
                              prodState=self.options.productionState)
At that point it fires the job off, then sits in WmiEPollReactor forever, never receiving a response...
sPOiDar:
I've never seen a situation where zendisc would not actually execute its tests. What version of Zenoss are you running? What's the patch level? Have you made any code changes by hand? How did you determine that zendisc isn't sending out any traffic?
Best,
--Hackman238
I'm using the Zenoss stack package on Ubuntu Server 10.04.3 (up to date as of today), Zenoss 3.1.0-0. No local code changes.
The system has a single interface, and to check for traffic I used:
tcpdump -n -i eth0 -A -s 0 host nnn.nnn.nnn.nnn
Okay, so if I run zendisc with --no-snmp it adds the device fine, I can then remodel and everything works (including SNMP), so this is definitely specific to zendisc.
It doesn't quite feel like I've got enough to file a bug report with, but this is definitely a problem for us, and one that we need to solve. Anyone have any pointers on where to go from here to get more information?
sPOiDar:
That's very bizarre. Have you tried running zendisc from the CLI in debug mode? This might tell us more. On the master, `su zenoss` then `zendisc run --now -d {yourdevice} --deviceclass {TheDeviceClassYouWant} -v10`
Best,
--Hackman238
Yeah, that's how I got the original log, and how I tested with --no-snmp (and a debugger to find where it's looping) :-/
sPOiDar:
Gotcha. What I might recommend is to stop Zenoss, clear out your logs (save them if you like), then
`su zenoss`
`zeoctl start`
`zopectl fg` - make note of any failed modules, etc
On another console-
`su zenoss`
`zenhub start -v10`
`zenmodeler start -v10`
Finally, try running zendisc again.
If anything suspicious shows up in any of the newly created logs, tar them up after sanitising IPs, etc., and post them.
sPOiDar,
Is this happening against any box you try to discover? If it's specific to that box, can you try an snmpwalk against it and tell me what you get back?
Thanks,
Chip Holden
Zenoss engineering
@hackman238 - I've done this, no obvious errors, I'll parse the logs more closely when I'm awake to make sure but it all looks sane. Behaviour remains the same though.
@mrchippy - It happens for at least the couple that I've tried. Running snmpwalk works as expected, and as I say - zendisc doesn't generate any traffic whatsoever to the host. As I said earlier, if I manually discover with --no-snmp, then model the host I get all the SNMP data populated immediately, and everything works from then on.
The problem appears to be that when zendisc fires off discoverDevice it goes into a poll for a response, but whatever process discoverDevice is meant to trigger never reports back, and nothing at all seems to happen.
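The pattern described here (fire off a request, then wait on a response that never arrives) can be reproduced in miniature. zendisc is built on Twisted; the sketch below deliberately uses the standard library's asyncio instead, purely as an analogy, and `discover_device` is a hypothetical stand-in, not the Zenoss method. It shows how wrapping the wait in a timeout turns a silent hang into something that can be detected and logged.

```python
import asyncio


async def discover_device(ip, responds):
    """Hypothetical stand-in for a remote discovery call (not Zenoss code)."""
    if responds:
        await asyncio.sleep(0.01)
        return {"ip": ip, "snmp": True}
    # Simulate a callback that never fires: await a future nobody resolves.
    await asyncio.get_running_loop().create_future()


async def discover_with_timeout(ip, responds, timeout=0.1):
    """Return the discovery result, or None if no response arrives in time."""
    try:
        return await asyncio.wait_for(discover_device(ip, responds), timeout)
    except asyncio.TimeoutError:
        return None


def run(ip, responds):
    return asyncio.run(discover_with_timeout(ip, responds))


print(run("192.0.2.1", True))   # the call answers
print(run("192.0.2.1", False))  # the call never answers; timeout gives None
```

Without the timeout, the second call would wait indefinitely, which matches the symptom of zendisc sitting in its reactor after 'Doing SNMP lookup on device'.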