Nov 18, 2009 2:34 PM
Need some help with dependencies
The admin guide only mentions dependencies a couple of times, and it doesn't have much detail. I guess it is supposed to be automatic. For some reason, it isn't for me. My Zenoss server is on a Class C network at our office, and it is monitoring a router, a server, and at least 1 WAP at 110 remote locations. All remote connections are MPLS. When the router goes down, I get alerts for all the devices going down.
Zenoss is definitely polling SNMP on the router and server (not the WAPs yet). It is polling the serial side of the router, but it sees the Ethernet interface as well in the OS tab. The Ethernet interface on the router is the default gateway on the servers and the WAPs.
I think the email alerts I set up are pretty basic. I went with most of the defaults. I didn't see anything related to dependencies in the alert setup. Am I missing something, or can someone point me in the right direction? Should Zenoss be OK with the fact that it monitors the serial interface but the devices are connected to the Ethernet interface? If it isn't OK with that, can I change a setting somewhere to set it manually?
**I just added a screenshot of the last 6 alerts I got. The IP for the router in the alert is the serial interface.
Thanks,
Matthew
You need to make sure the ping topology generated by ZenPing is correct.
Also, check that Zenoss can get the relevant routing tables all the way
between your Zenoss server and the devices.
See the Admin guide:
The forums and wiki have more information on testing the ping topology tree.
--
James Pulver
Information Technology Area Supervisor
LEPP Computer Group
Cornell University
You have just run into a pretty serious shortcoming in Zenoss, though it seems to be a common shortcoming of most open-source monitoring solutions. Zenoss can suppress "downstream" events (for devices that are beyond the network failure point) only if Zenoss has already collected complete routing tables from all devices between Zenoss and the farthest devices. You stated you have an MPLS network between your Zenoss server site and your remote sites. I presume you can't get routing tables from the MPLS network because it is run by a provider, so you don't administer it and can't get access to it. Even if you did run the MPLS network, Zenoss would not pick up all the routing tables from the MPLS routers (assuming you have MPLS VPNs configured in the MPLS network).
We routinely just deal with large numbers of "false" events when there is a network failure close to the Zenoss server. There is at least one manual device dependency solution out there in the community, but it seems complex enough that I don't want to get into it. So I hope for a robust solution from the Zenoss team.
What Zenoss needs is Layer 2 topology knowledge and needs to suppress events based on layer 2 topology (this is a big deal in a large ethernet switch network). If Zenoss had this capability, it would learn the Layer 2 connectivity in the network devices you manage and you could hopefully represent the provider's MPLS network as maybe a plain router and add false Layer 2 connections that connect to your remote sites. Then, Zenoss could effectively suppress "downstream" events.
Zenoss team?
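To make the idea above concrete, here is a minimal sketch in plain Python of path-based suppression. It is not Zenoss code and does not reflect how ZenPing is implemented; the device names, the `links` map, and the functions `find_path` and `classify` are all made up for illustration. The provider's MPLS network is represented as a single stand-in node, as suggested above. The sketch only follows one shortest known path, so it assumes a simple tree-like network without redundant routes.

```python
from collections import deque

def find_path(links, start, goal):
    """Breadth-first search over known links; returns the node list or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in links.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def classify(links, monitor, device, down):
    """Return 'suppress' if a hop between monitor and device is down, else 'alert'."""
    path = find_path(links, monitor, device)
    if path is None:
        # No known path: the outage cannot be attributed to an upstream hop.
        return "alert"
    for hop in path[1:-1]:  # intermediate hops only
        if hop in down:
            return "suppress"
    return "alert"

# Hypothetical topology: the MPLS cloud is modeled as one fake node.
links = {
    "zenoss": ["office-router"],
    "office-router": ["mpls-cloud"],
    "mpls-cloud": ["site3-router"],
    "site3-router": ["site3-server", "site3-wap"],
}
down = {"site3-router", "site3-server", "site3-wap"}

# The same topology with the hop into the MPLS cloud unknown.
links_with_gap = {k: v for k, v in links.items() if k != "office-router"}

router_result = classify(links, "zenoss", "site3-router", down)
server_result = classify(links, "zenoss", "site3-server", down)
gap_result = classify(links_with_gap, "zenoss", "site3-server", down)
```

With the full map, only the router produces an alert and the server behind it is suppressed. With the gap, no path to the server is known, so its event cannot be suppressed, which matches the behavior described in this thread.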
It seems to have collected all the appropriate routing information. Our network does not change often, so I don’t know if that would be an issue. I'm far from a network engineer. My network engineer says: "Only that I'm not sure why this would apply, since we are just talking (mainly) about devices on a lan behind a router. This isn't a case where we need to worry about 5 hops from point a to point b with multiple pathways to get there"
I had no idea manual dependencies weren't an option. We set up our dependencies manually in our previous product, so I would be fine with doing that, but I guess that isn't an option either. I have searched through a ton of documents and haven't found anything that tells me exactly how to test whether Zenoss knows the correct routing information. Can anyone tell me what to do to test whether it has the correct routing/dependency info for a particular device?
Thanks,
Matthew
Why doesn't Zenoss copy from Nagios?
I think that Nagios is absolutely superior in this case. You can manually configure your hierarchy (parent and child relationships) and the problem is solved. Nothing complicated, just the annoyance of manually configuring dependencies.
Moreover, Nagios has another GREAT FEATURE: service dependencies.
Honestly, I don't understand why Zenoss doesn't implement these features. Yes, I know that there are a lot of things to be created, that deployment requires time and money, and that there are priorities, but I think that without these features Zenoss will not be a REAL enterprise solution.
I like Zenoss and I use it, but strictly speaking about "real core features", I think that these features are absolutely compulsory.
You have said it well: "the annoyance of manually configuring dependencies in Nagios". If you can put up with that, then why not do the same thing in Zenoss? Nagios is a subset of Zenoss. While Zenoss implements auto-discovered dependencies, it also allows you to manually set arbitrary dependencies; you just need a little bit of Python skill.
Yes, if you look at the Community FAQ, and or the wiki, you can see the
snippet of python code to use in event transforms to manually create
device dependencies.
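For readers who want the general shape of a manual dependency check, here is a minimal sketch in plain Python. It is not the snippet from the Community FAQ or wiki, and it does not use the Zenoss API; the device names, the `PARENTS` map, and the functions `root_cause` and `should_alert` are hypothetical. Wiring this kind of logic into an actual Zenoss event transform is not shown here.

```python
# Hypothetical manual parent map: child device -> device it depends on.
PARENTS = {
    "site3-server": "site3-router",
    "site3-wap": "site3-router",
    "site4-server": "site4-router",
}

def root_cause(device, parents, is_down):
    """Walk up the parent chain and return the topmost down ancestor, or None."""
    cause = None
    seen = {device}  # guards against accidental loops in the map
    current = parents.get(device)
    while current is not None and current not in seen:
        seen.add(current)
        if is_down(current):
            cause = current
        current = parents.get(current)
    return cause

def should_alert(device, parents, is_down):
    """Alert only when no device that this one depends on is down."""
    return root_cause(device, parents, is_down) is None

# Example state: the site 3 router and everything behind it look down.
currently_down = {"site3-router", "site3-server", "site3-wap"}
is_down = currently_down.__contains__

alert_for_router = should_alert("site3-router", PARENTS, is_down)
alert_for_server = should_alert("site3-server", PARENTS, is_down)
cause_for_wap = root_cause("site3-wap", PARENTS, is_down)
```

In this example the router still alerts because it has no parent in the map, while the server and WAP behind it are attributed to the router. The cost is the same as with Nagios parents: the map has to be maintained by hand for every site.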
I also strongly support layer 2 mapping and dependencies.
--
James Pulver
Information Technology Area Supervisor
LEPP Computer Group
Cornell University
chitambira wrote, On 11/19/2009 6:21 AM:
You have said it well, "the annoyance of manually configuring dependencies in Nagios" If you can put up with that, then why not do the same thing as well in Zenoss? Nagios is a subset of Zenoss, whilist Zenoss implements auto discovered dependencies, it allows you to manually set you abitrary dependencies manually, you just need a little little little little bit of python skill.
I wanted to make it clear that I don't care about mapping, meaning I don't want or care whether Zenoss can draw a network map. The current mapping capability is unusable (it's too slow and stuff moves around too much). I just want Zenoss to learn topology dependencies so that it can suppress status events for devices that are on the far side of a network failure.
I also re-read the original post in this thread and I think I misunderstood the original problem. If Zenoss has grabbed routing tables for a router at site #3 and knows that servers are on subnets either attached to the router or routable by routers at the site, the current event suppression should work fine.
Matt
I've searched looking for the mentioned methods of adding a manual dependency, but all I have found is this:
that isn't really practical to do for all my devices.
My network isn't complicated or frequently changing. It sounds like Zenoss shouldn't have a problem with the automatic method for me. I would love some help in troubleshooting this. My network map is a blank page. I don't know if that is indicative of a larger problem. I have looked through zenping.log, and all I see are the lines where it mentions devices being down. I haven't seen anything related to topology, or anything that indicates whether there is a problem with it obtaining routing information. Can someone tell me the best way to determine whether it is getting the information it needs to handle dependencies properly?
Thanks for all the help.
-Matthew
Unfortunately, that's it for layer 2. (So if a switch goes down, you get
alerts on everything behind it).
--
James Pulver
Information Technology Area Supervisor
LEPP Computer Group
Cornell University
Zenoss has indeed pulled the routes from the routers at our remote sites. All we want is to suppress the alerts pertaining to LAN hosts that are directly connected to a router that is down. So you are saying this should work, even though we don't know all the routes between sites (due to MPLS)?
mwoodling wrote:
I also re-read the original post in this thread and I think I misunderstood the original problem. If Zenoss has grabbed routing tables for a router at site #3 and knows that servers are on subnets either attached to the router or on subnet routable by routers at the site, the current event suppression should work fine.
Matt
No, you need the entire topology between the Zenoss server and the
endpoint for this to work right.
--
James Pulver
Information Technology Area Supervisor
LEPP Computer Group
Cornell University
Let me follow up: I think you can manually create "fake" routes in
Zenoss to work around this. There are some forum posts on that.
--
James Pulver
Information Technology Area Supervisor
LEPP Computer Group
Cornell University
Thanks for the tips. Since we are on an MPLS network, I guess we are out of luck with the automatic discovery. That stinks, but so be it. There isn't a built-in way to set manual dependencies, so we are out of luck there as well. I have seen several references to posts and wiki entries that I haven't been able to find. It is possible I'm just picking the absolute wrong terms to search for. Can anyone point me towards the sample code to add a manual dependency, or the directions for faking routes, so we might be able to figure out how to reduce the number of extra alerts we are getting?
Thanks,
Matthew