Ping Template and Command Data Source Walkthrough

Created on: Jun 7, 2010 4:00 PM by Matt Ray - Last Modified:  Jun 7, 2010 4:03 PM by Matt Ray

From a session on IRC:

1:51:02 PM arrrghhh: hey all.  trying to get zenoss to display metrics on ping response times for devices, and i'm hitting a brick wall.  has anyone done this before?
1:53:25 PM mray: arrrghhh: what do you want?  Just a ping graph?
1:53:58 PM arrrghhh: mray, yes, our old network monitoring system has a response time graph that helps us pinpoint outages.
1:54:12 PM arrrghhh: thread/5455?start=0&tstart=0 - that thread seems like what i want, but i can't figure out how to get it working.
1:54:14 PM kobalt: there is a check_ping plugin that will tell you latency for the pings, but from my testing it's no responsive enough in large networks
1:54:30 PM arrrghhh: check_ping, yes, that's what i was going to use...
1:54:44 PM kobalt: s/no/not
1:54:44 PM arrrghhh: we don't monitor an enormous number of devices by any means, it's around 500.
1:54:47 PM mray: kobalt: too much overhead running that many commands?
1:55:08 PM kobalt: mray, it takes about 20 seconds to run the command
1:55:15 PM mray: kobalt: per device?
1:55:21 PM kobalt: mray yes
1:55:26 PM mray: wow, that blows
1:55:49 PM arrrghhh: hrm.  i wonder how nmis does it... probably something similar.  nmis is far from elegant.  our network monitoring department is just used to it, and i know they're going to complain if they don't have something similar with zenoss.
1:55:56 PM mray: well… there is a half-finished example somewhere for doing it with Twisted
1:56:16 PM kobalt: it could be my setup but I could not get it under about 9 seconds for the localhost (pinging the server that zenoss is on)
1:56:39 PM arrrghhh: so this isn't something that's built-into zenoss?  seems like it would be something that is pretty desirable for routers...
1:56:46 PM arrrghhh: for monitoring routers that is.
1:56:53 PM mray: kobalt: the Dev Guide 11.4 is a start on a new PingPerf collector
1:57:25 PM mray: arrrghhh: we just collect ping availability, not the performance
1:58:00 PM mray: but yeah, it's one of those things people ask for that we never get around to doing
1:58:11 PM mray: since fPing and check_ping work fairly well
1:58:24 PM arrrghhh: mray, well i guess i'd like ping availability graphed...
1:58:38 PM mray: arrrghhh: I can walk you through it, it's pretty simple
1:58:53 PM arrrghhh: if i'm already pinging the device with zenoss, can't i tie that to a graph point?
1:58:57 PM arrrghhh: alrighty
1:59:48 PM mray: arrrghhh: have you looked at docs/DOC-3416 ?
2:00:06 PM mray: specifically the attached PDF?
2:00:45 PM mray: all you need is a custom template with a command data source
2:01:03 PM arrrghhh: i'll take a look
2:01:08 PM arrrghhh: i haven't looked at it.
2:01:43 PM mray: arrrghhh: I'll walk you through it, it's really simple
2:01:47 PM mray: go to Devices
2:01:58 PM arrrghhh: k
2:01:59 PM mray: http://yourserver:8080/zport/dmd/Devices
2:02:15 PM mray: Templates->Add Template
2:02:22 PM mray: call it "PingPerf"
2:02:38 PM arrrghhh: alrighty
2:02:42 PM mray: Add a Command Data Source
2:02:57 PM mray: call it Ping
2:03:41 PM arrrghhh: k
2:06:14 PM mray: hmm… lemme get the path to check_ping right
2:06:24 PM arrrghhh: i think i have it in mine...
2:06:59 PM arrrghhh: /opt/zenoss/common/libexec/check_ping -H $devname -w 180,100% -c300,100% |sed -e 's# - Packet loss#|loss#;s#%,##;s#ms##;s# = #=#g'
2:07:08 PM arrrghhh: that's the command i have for my 'test' setup.
2:07:14 PM mray:  /usr/local/zenoss/common/libexec/check_ping -H ${here/manageIp} -w 180,100% -c 300,100% | sed -e 's# - Packet loss#|loss#;s#%,##;s#ms##;s# = #=#g'
2:07:28 PM mray: note that I replaced $devname with ${here/manageIp}
2:07:39 PM mray: in case DNS fails
2:07:39 PM arrrghhh: is that better than $devname?
2:07:43 PM arrrghhh: ah
2:07:44 PM mray: yeah
2:07:45 PM arrrghhh: i see
2:08:00 PM mray: I don't have DNS consistently setup with some boxes
2:08:07 PM arrrghhh: alrighty, what's next.
2:08:13 PM mray: save that, then test it against a known device
2:08:35 PM mray: Executing command /usr/local/zenoss/common/libexec/check_ping -H 10.87.209.111 -w 180,100% -c 300,100% | sed -e 's# - Packet loss#|loss#;s#%,##;s#ms##;s# = #=#g' against 10.87.209.111
    PING OK|loss=0 RTA=0.05 |rta=0.046000ms;180.000000;300.000000;0.000000 pl=0%;100;100;0
    DONE in 4 seconds
2:08:41 PM arrrghhh: yep.
2:08:46 PM arrrghhh: done in 4 secs as well.
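
The sed stage of the command above can be tried without a monitored device. What follows is a minimal sketch, assuming the raw plugin output follows the usual check_ping layout. The `sample` line is reconstructed from the test result mray pasted and does not itself appear in the transcript. The sketch applies the same four substitutions and prints the rewritten line.

```shell
# Sample line in the usual check_ping layout. This is an assumption:
# the raw plugin output is not shown in the transcript.
sample='PING OK - Packet loss = 0%, RTA = 0.05 ms|rta=0.046000ms;180.000000;300.000000;0.000000 pl=0%;100;100;0'

# Same sed expressions as the data source command:
#   1. turn " - Packet loss" into "|loss" so loss lands after the first pipe
#   2. drop the first "%,"
#   3. drop the first "ms"
#   4. collapse every " = " into "="
rewritten=$(printf '%s\n' "$sample" |
  sed -e 's# - Packet loss#|loss#;s#%,##;s#ms##;s# = #=#g')

printf '%s\n' "$rewritten"
```

The printed line should match the `PING OK|loss=0 RTA=0.05 |rta=...` text in the test output, with `loss` and `RTA` now sitting after the first `|` as name=value pairs.
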
2:08:57 PM mray: go back to the PingPerf template
2:09:01 PM arrrghhh: k
2:09:13 PM mray: oops, I forgot the data points
2:09:22 PM mray: back to the data source
2:09:35 PM arrrghhh: alright
2:09:39 PM mray: data points are the values returned by your script
2:09:50 PM mray: "loss" and "RTA"
2:09:51 PM arrrghhh: LOSS & RTA?
2:09:55 PM arrrghhh: excellent.
2:10:16 PM arrrghhh: what do i configure for them tho?
2:10:35 PM mray: defaults
2:10:44 PM mray: they're both gauges
2:10:52 PM arrrghhh: so no rrd min/max & nothing for 'create cmd'?
2:11:00 PM mray: nah, those are dynamic
2:11:05 PM arrrghhh: alrighty.
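
The data point names have to match the labels the command prints. As a rough illustration only (this is not Zenoss's own parser), the sketch below takes the rewritten line from mray's test run, keeps the text between the first and second `|`, and splits it into name and value pairs. The variable names are hypothetical.

```shell
# Rewritten command output, copied from the test run in the transcript.
line='PING OK|loss=0 RTA=0.05 |rta=0.046000ms;180.000000;300.000000;0.000000 pl=0%;100;100;0'

# Keep only the text between the first and second pipe.
perf=$(printf '%s\n' "$line" | cut -d'|' -f2)

names=""
# $perf is left unquoted on purpose so the shell splits it on whitespace.
for pair in $perf; do
  name=${pair%%=*}    # text before the first "="
  value=${pair#*=}    # text after the first "="
  printf 'data point %s = %s\n' "$name" "$value"
  names="$names $name"
done
```

The labels that come out are `loss` and `RTA`, with that exact capitalization, which are the names mray gives for the two data points.
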
2:11:08 PM mray: now you can add a threshold
2:11:11 PM arrrghhh: k
2:11:19 PM mray: "loss"
2:11:31 PM mray: select Ping_loss for your Data Point
2:11:41 PM mray: perhaps 5 for max value
2:11:57 PM mray: Event Class of /Status/Ping makes sense
2:12:03 PM arrrghhh: ok
2:12:12 PM arrrghhh: then do a second for RTA as well?
2:12:18 PM mray: yup
2:12:38 PM arrrghhh: alrighty
2:12:59 PM mray: Add a graph
2:13:08 PM mray: "Ping Perf"
2:13:22 PM arrrghhh: defaults for it as well?
2:13:34 PM mray: yeah
2:13:39 PM arrrghhh: i guess i should put Milliseconds for units
2:13:49 PM mray: arrrghhh: you could make 2 graphs, 1 for loss and 1 for RTA
2:14:05 PM arrrghhh: can i have them both in the same graph?
2:14:15 PM MWT: Hi all, does anyone have time for a little tutoring?
2:14:31 PM mray: arrrghhh: well, you could but you'd be comparing ms with counts
2:14:45 PM arrrghhh: MWT, go ahead and ask away!
2:14:49 PM mray: MWT: I'm finishing up a Command Data Source walkthrough
2:14:59 PM arrrghhh: mray, ah, i see... yea probably would be better to have 2 graphs then.
2:15:02 PM MWT: cool, ill follow along
2:15:20 PM mray: arrrghhh: you may want to clean up the descriptions a bit
2:15:35 PM kobalt: mray: I made the change you suggested on my setup and it went from 19 sec to 4 sec. hrmm... dns issue I think
2:15:50 PM mray: kobalt: yeah
2:16:04 PM mray: arrrghhh: ie. change the threshold name to "Max Loss" or somesuch
2:16:18 PM mray: you can also play with the colors, I like red
2:16:50 PM mray: E90000
2:17:16 PM arrrghhh: okie
2:17:32 PM arrrghhh: i have graphs now on this device, but they're blank... shouldn't i have at least one data point?
2:17:48 PM mray: ahh Grasshopper, have you bound your new Template?
2:17:58 PM arrrghhh: i believe so
2:18:09 PM arrrghhh: to one device
2:18:40 PM mray: lemme go see what happens with 1 device
2:18:42 PM arrrghhh: i just went to an individual device, went to more->templates and bind template
2:19:19 PM mray: and you control-clicked?
2:19:21 PM mray: mine works
2:19:48 PM arrrghhh: control-clicked?
2:20:07 PM mray: More->Templates
2:20:08 PM arrrghhh: the graph for Ping Perf and RTA perf show
2:20:14 PM arrrghhh: i just don't get any data points.
2:20:21 PM mray: arrrghhh: patience
2:20:24 PM mray: 5 minutes
2:20:33 PM arrrghhh: ah, so i do need to let it run.
2:20:37 PM mray: the RRD file is getting filled
2:20:46 PM arrrghhh: that's fine, just wanted to make sure.  i figured i would get one data point right off the bat
2:21:03 PM mray: nope
2:21:07 PM arrrghhh: alrighty
2:21:11 PM mray: you should now have 2 graphs on your special device
2:21:18 PM arrrghhh: i do
2:21:20 PM mray: you could go and bind the template wherever you need it
2:21:28 PM mray: and check_ping will run against those devices
2:21:33 PM arrrghhh: k
2:21:41 PM arrrghhh: i'm assuming i can associate the template to an entire device group?
2:21:46 PM mray: yes
2:21:48 PM mray: you could also go Create a New ZenPack and export this
2:22:00 PM mray: you could call it Nagios Check Ping, but that ZenPack is already taken
2:22:23 PM arrrghhh: haha
2:22:47 PM arrrghhh: sweet.
2:23:03 PM arrrghhh: alrighty, it's all bound.  now consider adding this as something built-into zenoss!
2:23:21 PM mray: yeah, it's on the long list of "duh" things we should add
2:25:19 PM arrrghhh: holy crap.
2:25:51 PM arrrghhh: i think i need to adjust my thresholds.  i have 180+ events for threshold of RTA exceeded.
2:27:08 PM mray: arrrghhh: did you set it at 1 or something?
2:27:18 PM arrrghhh: mray 30.  i just set it to 200.
2:28:47 PM arrrghhh: mray, thanks for the help.  going to lunch, hopefully i have a lot of pretty graphs when i get back
2:34:11 PM mray: my graphs are populating now
