Running processes appear as being down - what am I doing wrong? - Open Source Network Monitoring and Systems Management


3514 Views 5 Replies Latest reply: Dec 1, 2011 7:34 AM by tgdaero9

sec

3 posts since
Oct 14, 2011

Oct 18, 2011 10:52 AM

Running processes appear as being down - what am I doing wrong?

Hi all, Zenoss newbie here. I have a question for you guys about the behavior of zenprocess on Zenoss 3.2.0.

I'm using Zenoss to monitor some Oracle Database instances. I have a list of processes to monitor for each database - for each instance, they are all of the form:

ora_pmon_<instance name>

ora_smon_<instance name>

ora_ckpt_<instance name>

ora_lgwr_<instance name>

ora_dbw0_<instance name>

and so on. I came up with regexes to match these processes, so that I can use zenprocess to monitor their status. For example, the regex for ora_pmon_<instance name> is

^[^ ]*_pmon_[^ /].*
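As a quick sanity check of the pattern outside Zenoss, here is a minimal sketch using Python's `re` module. The instance names (`ORCL1`, `ORCL2`, `ORCL3`) and the other process names are made up for illustration, and the sketch assumes the pattern is applied as a search against the process name, which may not mirror exactly how zenprocess evaluates it. Under those assumptions, the one pattern matches the pmon process of every instance:

```python
import re

# Pattern from the post above.
PMON_PATTERN = re.compile(r"^[^ ]*_pmon_[^ /].*")

# Hypothetical process names; the instance names are invented for illustration.
process_names = [
    "ora_pmon_ORCL1",
    "ora_pmon_ORCL2",
    "ora_pmon_ORCL3",
    "ora_smon_ORCL1",
    "/usr/sbin/snmpd",
]

# Keep only the names the pattern matches.
matches = [name for name in process_names if PMON_PATTERN.search(name)]
print(matches)
```

So a single regex entry covers several running processes at once, which is the situation described in the rest of the post.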

After adding the regexes for each process, when I model one of the database servers the result is that all 5 of these processes are picked up for each of 3 database instances and show in the list of "OS Processes". All of them register as being up, which is correct because I know they are all up. But, in a few minutes' time, most of them change their status to 'down'! Only one of each type (pmon, smon, etc) remains up.

Does anyone know what could cause this sort of behavior? Does this mean that I have to make a new process entry for each type of process for each database instance? Any help would be appreciated.

Tags: zenprocess, 3.2.0

jmp242
4,060 posts since
Mar 7, 2007

1. Oct 18, 2011 11:51 AM (in response to sec)

This may be a bug. Zenoss wasn't able to reproduce it, but if you can add to / re-open the ticket and work with them, they may be able to work out what's causing this. See http://dev.zenoss.org/trac/ticket/7870

--
James Pulver
Information Technology Area Supervisor
LEPP Computer Group
Cornell University

sec
3 posts since
Oct 14, 2011

2. Oct 18, 2011 2:49 PM (in response to jmp242)

The bug submitted in that report is very similar to my problem. The only difference is that my installation of Zenoss doesn't show any one process running more than once. They might just be related, though!

Unfortunately, my setup might not help them replicate the issue - I compiled Zenoss from source on a RHEL 6 platform (which according to the release notes for 3.2.0 is not supported yet), rather than using a standardized appliance or stack installer. Is it worth reopening the issue anyway?

Shane Scott
1,373 posts since
Jul 6, 2009

3. Oct 18, 2011 6:53 PM (in response to sec)

sec:

I would reopen the ticket if you have time even if the problem is resolved after a custom compile. I'm interested to see if it reoccurs or not in your compiled version. Keep us posted.

Best,
--Shane W. Scott (Hackman238)

sec
3 posts since
Oct 14, 2011

4. Oct 19, 2011 11:39 AM (in response to Shane Scott)

I have reopened the ticket. I'll update this thread as I find out more.

tgdaero9
42 posts since
Sep 23, 2008

5. Dec 1, 2011 7:34 AM (in response to sec)

I am facing exactly the same issue. We are in the process of migrating from a production Zenoss Core v2.4.5 to the current v3.2.1.

We monitor Oracle Solaris 10 servers (global zones and non-global zones/containers) on both Zenoss versions with exactly the same patterns. The following processes are modeled on both, but alert as down on v3.2.1 only:

zCountProcs is set to false on both (I also tried true).

- pattern=ora_dbw - only one process (ora_dbw1) is up, whereas all others matching the pattern (ora_dbw0, 2 - n) are down.
- pattern=snmpd - says down, which is not true
- pattern=^sched (even though zsched processes are discovered?) - says down, which is not true

I think v3.2.1 does not deal properly with OS processes as soon as a pattern matches more than one process (e.g. on a global zone, all the container processes are visible as well).
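To illustrate what "a pattern matches more than one process" looks like, here is a rough sketch in Python. The process names are hypothetical (they are not taken from a real server), and it assumes the patterns are applied as an unanchored regex search against each process name; how zenprocess itself evaluates them on v3.2.1 may differ.

```python
import re

# Patterns listed above.
patterns = ["ora_dbw", "snmpd", "^sched"]

# Hypothetical process names as they might appear on a global zone.
process_names = [
    "ora_dbw0_DB1",
    "ora_dbw1_DB1",
    "ora_dbw2_DB1",
    "/usr/sbin/snmpd",
    "sched",
    "zsched",
]

# For each pattern, collect every process name it matches.
matched = {
    pattern: [name for name in process_names if re.search(pattern, name)]
    for pattern in patterns
}

for pattern, names in matched.items():
    print(pattern, len(names), names)
```

Under these assumptions, `ora_dbw` covers several processes at once, while `^sched` is anchored to the start of the name and so would not be expected to match `zsched`.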

I just found a difference in IgnoreParameters, maybe due to trying several settings to fix this. Even after making it equal on both, there is no change in alerting behaviour.

Another strange behaviour: when I globally change the name/pattern of a process, the process class on the device does not reflect this (snmp >> snmpd; the process class is still snmp).
